%  Tables_7_BLS_MFP_data_for_BPEA_2017_05.m 
%  Revised by Fernald for Fernald, Hall, Stock, and Watson, BPEA 2017.
%  Writes out Table 7 (Industry productivity) and prepares the data for Table 8 (Regdata)

%  WARNING: Won't work properly in Matlab for OSX because it makes extensive use of the 
%  Excel COM Server to write out and format the spreadsheets.

%  Updated with industry data through 2014 

%  Analyzes industry MFP data from the BLS
%  The code was originally written for "Productivity and Potential Output" (NBER Macro Annual)
%  Written by John Fernald/Bing Wang 5/2014

%  Parts were then used for "The Pre-Recession Slowdown in Productivity" (Cette, Fernald,
%  Mojon) and "Does the U.S. Have a Productivity Slowdown or a Measurement Problem?"
%  Hence, there is presumably a lot of extraneous code here...


% if needed:
% system(' taskkill /F /IM excel.exe ')
  
%  Data sources include 
%   \BLS MFP Data	Manuf. and nonmanuf. data from the BLS
%   \BLS I-O		Input-output data, used to calc finance shares
%   \IT capital		Data on IT composition of industry cap. Input

% Other directories are
%    \Data			Has a template for output of the program, and saves temp files
%    \Out			Output of the program is written here



%  Section 1 reads in the BLS MFP data
%       Throughout, industry numbering (1-70) corresponds to the "original" MFP numbering
%       at least after the BLS manufacturing data are stacked on top of the non-manufacturing 
%       data (which the BLS provides in separate spreadsheets--which my RA merged manually).
%       I've added some additional "aggregates" (aka, "newaggs") below that. For example:
%   
%           71 = private business
%           72 = IT producing
%           73 = non-IT producing
%           74 = well measured
%           75 = poorly measured
%
%       Mnemonics in the code mostly mean something to John.  dX means 100*log-growth in X.  A double first
%       letter usually means average share in t and t-1, e.g., if VAWt is the value-added weight 
%       of the industry, VVAWt is the average value-added weight in adjacent periods.  
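%
%       A minimal illustration of these two conventions (commented out; the numbers are
%       hypothetical, not BLS data):
%
%           X     = [100; 102; 105];                       % level of X in three consecutive years
%           dX    = 100*diff(log(X));                      % 2 x 1: 100*log-growth between adjacent years
%           VAWt  = [0.20; 0.22; 0.26];                    % value-added weight in each year
%           VVAWt = (VAWt(2:end) + VAWt(1:end-1))/2;       % 2 x 1: [0.21; 0.24]
%
%       Note that growth rates and averaged shares have one fewer row than the levels.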
%   
%       I recalculate all aggregates from the underlying industry data.  The reason is that 
%       the BLS aggregates (e.g., manuf, or information, or services) as reported are NOT necessarily
%       consistent with the underlying industry data.  E.g., manufacturing value-added is
%       not the sum of the manufacturing industries' value-added.  Steve Rosenthal at the BLS
%       told me (over the phone, Jan 17, 2014) that they use different data sources for the MF
%       aggregate and industry data. They also prefer to remove within-aggregate intermediate
%       flows from reported TFP, and I prefer to leave it in.  But I report everything in
%       value-added terms, so it shouldn't matter in principle (though could in practice,
%       because the BLS data are not internally consistent.)
%
%       In a number of places in the program, I aggregate using fishagg_jgf.m.  This procedure
%       does Fisher/chain aggregation (since real quantities are not additive!!).  It was 
%       originally produced by FRBSF Macro RAs, but then I modified it slightly to write out the  
%       nominal values. (This is a very close approximation to the Tornquist, which I'd usually prefer, but
%       I'm happy to use the Fisher).
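%
%       A minimal sketch of the two index formulas for two goods over two years (commented
%       out; hypothetical numbers).  This is an illustration only and is not meant to
%       reproduce what the aggregation routine does internally:
%
%           q = [10 20; 11 21];   p = [1.0 2.0; 1.1 1.9];       % rows = years, columns = goods
%           lasp    = sum(p(1,:).*q(2,:)) / sum(p(1,:).*q(1,:)); % Laspeyres quantity relative
%           paas    = sum(p(2,:).*q(2,:)) / sum(p(2,:).*q(1,:)); % Paasche quantity relative
%           dFisher = 100*log(sqrt(lasp*paas));                  % Fisher growth, in percent
%           s       = (p.*q) ./ repmat(sum(p.*q,2),1,2);         % nominal shares in each year
%           ss      = 0.5*(s(1,:) + s(2,:));                     % average share in t and t-1
%           dTorn   = sum(ss .* (100*diff(log(q))));             % Tornquist growth, in percent
%
%       In this toy example the two growth rates differ by less than 0.01 percentage point.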
%
%       Note that it is very slow (90 seconds on my C: drive) to read the BLS data from large spreadsheets; but very
%       fast to do everything else.  So I save the variables from the spreadsheet after reading it, and then if
%       readmfp = false, it simply reads in that saved output. (Changed 3/2017 to just save variables, not settings).
%
%
%  Section 2 reads the BLS input-output dataset to produce finance share and finance growth
%   
%       Doing so means taking the nominal and real "use" matrices from the BLS input-output tables,
%       and aggregating to the level used by the BLS MFP data.  
%           * The I-O matrices are 196 x 196 x 20 (one matrix for each of 20 years), which is a
%           hassle to deal with.  So I do some intermediate steps to simplify
%  
%       This involves two levels of aggregation.
%           * First, we aggregate across use of financial 'commodities' (5 of the 196 industries in the
%               I-O tables).  This gives us finance use by the 196 industries.  That matrix is
%               20 x 196 (so it follows the convention of years being in rows, which I try to follow)
%           * Then we aggregate industries to match BLS MFP industries. 
%               The mapping "key" is a big cell array called agglist.  That array also has the
%               names of the industries, in case we want to use it for making tables or the
%               like.  
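%
%       A minimal sketch of the two aggregation steps on a toy 4-industry, 3-year array
%       (commented out; the names use, finrow, and map are hypothetical stand-ins, and the
%       actual mapping lives in agglist):
%
%           use    = reshape(1:48, 4, 4, 3);               % commodity x industry x year nominal "use"
%           finrow = [3 4];                                % rows holding the financial commodities
%           finuse = squeeze(sum(use(finrow,:,:),1))';     % step 1: years x industries (3 x 4)
%           map    = {[1 2], [3 4]};                       % step 2: I-O industries in each MFP industry
%           finagg = zeros(size(finuse,1), numel(map));
%           for j = 1:numel(map)
%               finagg(:,j) = sum(finuse(:,map{j}), 2);    % simple sums are valid for nominal values only
%           end
%
%       Real (chained) quantities are not additive, so the real matrices need chain
%       aggregation rather than the simple sum shown here.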

%  Section 3 reads IT capital as produced by the BLS
%
%       The original purpose of this (for Fernald's Macro Annual paper) was to see how much of
%       the slowdown in productivity reflected the plucking of the low-hanging fruit of the IT
%       revolution.  So I need IT intensity and IT usage by industry!  In BFOS 2003, we had a
%       particular regression specification that we recommended running. This section produces 
%       the data to run that regression (though the code doesn't actually run it).


% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  Industry MFP dataset ordering/numbering	
%  NOTE:  agglist, in Section 2, maps the 192 BLS input-output industries to this list
%  newaggs, a list defined near the top of the program, adds the additional aggregates to this
%  list.

% #    NAICS    Name

% 1     MN   Manufacturing Sector
% 2     ND    Non-Durable Manufacturing Sector
% 3     311,312	  Food and Beverage and Tobacco Products
% 4     313,314	  Textile Mills and Textile Product Mills
% 5     315,316	  Apparel and Leather and Allied Products
% 6     322	  Paper Products
% 7     323	  Printing and Related Support Activities
% 8     324	  Petroleum and Coal Products
% 9     325	  Chemical Products
% 10	326	  Plastics and Rubber Products
% 11	DM	 Durable Manufacturing Sector
% 12	321	  Wood Products
% 13	327	  Nonmetallic Mineral Products
% 14	331	  Primary Metals
% 15	332	  Fabricated Metal Products
% 16	333	  Machinery
% 17	334	  Computer and Electronic Products
% 18	335	  Electrical Equipment, Appliances, and Components
% 19	336	  Transportation Equipment
% 20	337	  Furniture and Related Products
% 21	339	  Miscellaneous Manufacturing
% 		
% 22	11	Agriculture, Forestry, and Fishery 
% 23	111,112	 Crop & Animal Production 
% 24	113-115	 Forestry, Fishing, and Related Activities 
% 25	21	Mining 
% 26	211	 Oil and Gas Extraction 
% 27	212	 Mining, except Oil and Gas 
% 28	213	 Support Activities for Mining 
% 29	22	Utilities 
% 30	23	Construction 
% 31	42,44-45	Trade 
% 32	42	 Wholesale Trade 
% 33	44,45	 Retail Trade 
% 34	48-49	Transportation and Warehousing 
% 35	481	 Air Transportation 
% 36	482	 Rail Transportation 
% 37	483	 Water Transportation 
% 38	484	 Truck Transportation 
% 39	485	 Transit and Ground Passenger Transportation 
% 40	486	 Pipeline Transportation 
% 41	487,488,492	 Other Transportation and Support Activities 
% 42	493	 Warehousing and Storage 
% 43	51	Information 
% 44	511,516	 Publishing Industries 
% 45	512	 Motion Picture and Sound Recording Industries 
% 46	515,517	 Broadcasting and Telecommunications 
% 47	518,519	 Information and Data Processing Services 
% 48	52-53	Finance, Insurance, and Real Estate 
% 49	521,522	 Federal Reserve Banks, Credit Intermediation, and Related Activities 
% 50	523	 Securities, Commodity Contracts, and Investments 
% 51	524	 Insurance Carriers and Related Activities 
% 52	525	 Funds, Trusts, and Other Financial Vehicles 
% 53	531	 Real Estate 
% 54	532,533	 Rental and Leasing Services and Lessors of Intangible Assets 
% 55	54-81	Services 
% 56	5411	 Legal Services 
% 57	5415	 Computer Systems Design and Related Services 
% 58	5412-5414,5416-5419	 Miscellaneous Professional, Scientific, and Technical Services 
% 59	55	 Management of Companies and Enterprises 
% 60	561	 Administrative and Support Services 
% 61	562	 Waste Management and Remediation Services 
% 62	61	 Educational Services 
% 63	621	 Ambulatory Health Care Services 
% 64	622,623	 Hospitals and Nursing and Residential Care Facilities 
% 65	624	 Social Assistance 
% 66	711,712	 Performing Arts, Spectator Sports, Museums, and Related Activities 
% 67	713	 Amusements, Gambling, and Recreation Industries 
% 68	721	 Accommodation 
% 69	722	 Food Services and Drinking Places 
% 70	81	 Other Services, except Government 


clear all
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%   Section 0 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   Settings to modify

c_drive    = false;                % True if John's local c:\ drive, false if run off of the public s drive
readmfp    = false; % It's slow to read in the BLS spreadsheets.  If false (the default), read a saved version.
                 % If true, read the BLS MFP data and save the output as bls_mfp.mat.
                 % If you modify the code or the file locations, you may need to set this to
                 % true (to resave the file).
               
        
                 
if 1 == c_drive % these are settings to run off John's local drive, and assumes that subdirectory structure exists
    base_location = ['C:\Users\l1jgf01\Documents\Papers\BPEA_Hall, Stock, Watson\Data\Tables_7_and_8 (industry and regdata)'];
else % Change this to run it off the Dropbox folder
    base_location = ['W:\research\Brookings FHSW 2017\ReplicationFiles_ConferenceDraft\matlab\Tables_7_and_8 (industry and regdata)'];
end
s_drive_location = [base_location];
s_drive_library = [base_location '\m_library'] ;


                 
% The _drop lists by default drop NR (nat resources), C (construction), and Fin (finance).  For finance intensive, maybe want to drop just IT and finance.
% The same might also be true for IT intensity?
just_drop_IT_from_finance = true;                 
                 
% The code works with the following aggregate lists.  Additional sub-aggregates can be added to
% newaggs at the bottom.  The first column is the number (so 71=business), the second column tells 
% Matlab what to aggregate.  For example, 'business' will be defined below as a vector of
% industries, as will 'itprod'.  The third column gives a descriptive name.
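%
% Illustration only (commented out; toy values): the aggregation loop in Section 1 pastes the
% second column into a Matlab statement, so that column can hold either the name of an index
% vector or a literal index expression.  In effect:
%
%     business_ex = [3:10 12:21];                          % hypothetical stand-in for 'business'
%     row  = {'71', 'business_ex', 'Private business'};
%     idx1 = eval(row{1,2});                               % the vector stored in business_ex
%     idx2 = eval('8:10');                                 % [8 9 10]
%
% A name in the second column must therefore exist as a variable before that loop runs.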

newaggs = ...
            {'71' , 'business',             'Private business'
            '72' ,  'itprod'  ,             'IT producing'      
            '73' ,  'nonit',                'Non-IT producing'
            '74' ,  'well',                 'Well measured'
            '75' ,  'poor' ,                'Poorly measured'
            '76' ,  'it_int',               'IT-intensive'
            '77' ,  'notit_int',            'Non-IT intensive'
            '78' ,  'fin_int'               'Finance intensive'
            '79' ,  'notfin_int'            'Non-finance intensive' 
            '80'	'8:10'                  'Petroleum, chemicals, and plastics'
            '81'    '3:7'                   'Other non-durables'
            '82'	'12:14'                 'non-equip durables (metal, mineral, wood)'
            '83'	'[15:21]'               'equipment'    % I also experimented with "equipment ex computers" by dropping 17
            '84'	'[35:39]'               'Passenger and freight transport' 
            '85'	'[40:42]'               'Pipelines, warehousing, other'
            '86'	'[44 46]'               'Software and communications'
            '87'	'[45 47]'               'Other informat. (not publ., broadcast.)'
            '88'	'49:52'                 'Finance and Insurance'
            '89'	'53:54'                 'Real estate and leasing'
            '90'	'56:61'                 'Professional, technical, and support'
            '91'	'62:65'                 'Educ, health, and soc assist'
            '92'	'66:70'                 'Entertainment, accomm., and other'       
            '93'    'narrow_business'       'Business (ex NR-C-F)'  
            '94' ,  'it_int_drop',          'IT-intensive (ex NR-C-F AND IT-prod)'
            '95' ,  'notit_int_drop',       'Non-IT intensive (ex NR-C-F and IT-prod)'
            '96' ,  'fin_int_drop'          'Finance intensive (ex NR-C-F and IT-prod)'
            '97' ,  'notfin_int_drop'       'Non-finance intensive (ex NR-C-F and IT-prod) ' 
            '98'	'[15:16 18:21]'         'Equipment, exc comp. and semicond'
            '99' ,  'narrowest_business'    'Business (ex NR-C-F and IT-prod)'
            '100',  '[8 23:24 26:28]'          'Natural resources (NR, i.e., ag. and mining)'
            '101'   '[30 53 54]'             'Construction and real estate'
            '102'   '[23:24 26:28 30 49:54]' 'Nat. resources (NR), constr., FIRE (NR-C-F)'
            '103'   'non_it_drop'            'Non-IT prod. (ex NR-C-F)' 
            '104' ,  'well_drop',            'Well measured (ex NR-C-F and IT-prod)'
            '105' ,  'poor_drop' ,           'Poorly measured (ex NR-C-F  and IT-prod)'   
            '106' ,  '[12:16 18:21]',        'Durables (ex. comp. and semicond.)'   
            '107' ,   'it_int25' ,           'IT-intensive top 25% by weight'
            '108' ,   'it_int2550' ,         'IT-intensive 25-50% by weight'
            '109' ,   'it_int5075' ,         'IT-intensive 50-75% by weight' 
            '110',    'it_int7500'      ,    'IT-int 75-100%'    
            '111',    'it_int33',           'IT-int top 1/3'
            '112' ,   'it_int3367'  ,       'IT-int middle 1/3'
            '113' ,   'it_int6700'  ,       'IT-int lowest 1/3'
            '114',     '[9 10 12 13 14 15 27 28]',   'DJ1' 
            '115',     '[3 4 5 6 20 23 24]',         'DJ2'
            '116',      '[32 33 35 39 42 44 45 56 58 59 60 62 66 67 68 69 70]',  'DJ3'
            '117',      '[49 50 51 52 53 54]'    ,   'DJ4'
            '118',      '[63 64 65]'             ,   'DJ5'
            '119',      '[7 16 18 19 21 30 36 37 38 41 47 61]',   'DJ6'
            '120',      '[8 26 40]',                 'DJ7'
            '121',      '[17 57]',                   'DJ8'
            '122',      '46' ,                       'DJ9'
            '123',      '29',                        'DJ10'       
            '124',      'mkt_services'     'Mkt services excl. finance'
            '125'       'not_mkt_services'  'Not mkt services (ex NR-C-F and IT-prod)'
            '126'       '[63 64]'     'Health care'
            '127'       '[56:58]'   'Legal, computer, prof'      % to use with 3-digit regdata          
            '128'       '[8 26 29 40]' 'Energy (refining, oil&gas extraction, utilities, pipelines)'
            '129'       'non_finance'   'Business ex. finance'
            '130'       'list_for_reg'  'industries with regdata'
            '131'       'non_finance_it'    'Business ex. finance and IT prod'
            '132'       'transport'     'Transportation (ex. pipelines)'
                      };
            

  

%   DJ (Dow Jones) Groups are from "BLS_MFP_categories_to_Bloomberg.xlsx" which Bing sent me on 8/11/2015

% Groupings of industries: in form of indices in agglist (list of industries and descriptions)
% (we calculate VA TFP growth, accel, etc. other measures on these later for tables)
% These categories become "industries" 71-79 in table

% These are the industries that exist for the regdata dataset on regulation. About 80
% percent of value added...
% NOTE: 13 appears twice in the list below, so sums over list_for_reg count industry 13 twice.
list_for_reg=    [ 23	24	26	28	29	30	13	13	14	17	18	19	12	21	3	4	6	8	9	32	33	...
    35	36	37	38	40	41	42	46	49	50	51	52	53	127	59	61	62	63	65	67	70];

trade    = [32 33] ; % or as 31.  Sometimes I want to aggreg WT and RT, others not. I always treat them together, e.g., in IT intensive
business = [3:10 12:21 23:24 26:30 trade 35:42 44:47 49:54 56:70]; % combines 32, 33.  
                                                            % Also change 'well' and 'it_int'
                                                            % and 'it_int_drop'
                                                          

itprod   = [17 44 57]; %IT producing
nonit    = business(~ismember(business,itprod));
non_finance = business(~ismember(business,[49:52]));
non_finance_it = business(~ismember(business,[49:52 itprod]));

transport = [35:39]; % excludes pipelines, which are in energy. For other purposes, will want it

well     = [3:10 12:21 23:24 26:29 trade 35:42 46 68 ];%46 68 69 ]; % follows Griliches 1994 and Nordhaus 2002
                    % "well" = mf, ag, mi, trans, communi, pub util.  I went back and forth 
                    % on "trade" (32:33), which Griliches excludes but Nordhaus includes.  In
                    % the end, I included it as well measured.
                    % What about accommodation (68) and food and beverage (69)?  Nordhaus
                    % includes accommodation (68) but not food services (69)
                    % broadcasting and telecomm (46) (but not motion pictures?)  
                    % robustness 
                    % well = [3:10 12:21 23:24 26:29 35:42 ]  % Narrow, supports footnote in text
                                                            % that can switch trade to poor,
                                                            % and broadcasting/68/69
poor = business(~ismember(business,well)); % poor is those elements of business that aren't members of the 'well' group
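% Illustration only (commented out; toy vectors): the logical-index idiom used for these
% complements keeps the ordering of the first list, whereas setdiff returns sorted values.
% That matters for lists kept "in order of intensity."
%     b = [9 5 3 1];   w = [3 1];
%     keep_order = b(~ismember(b,w));     % [9 5]
%     sorted     = setdiff(b,w);          % [5 9]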

mkt_services = [trade 35:42 45:47 56 58:60]; % mkt services

% IT-intensive and non-intensive, finance-intensive and non-intensive
% Sections 2 and 3, below, calculate sskvfull (IT share) and FinSharefull.  Section 4 uses
% those calculations to order it-intensive and non-it-intensive.  But it's useful to have
% the lists first.  So the lists were pasted here.
% Note the main list it_int is a subset of business, so includes NR-C-F and it-producing.  My 
% preferred list for the paper omits NR-C-F and IT-prod, and is labeled as _drop; created below)
%it_int = [3,8,9,16,17,18,19,21,28,29,32,35,37,39,40,44,45,46,47,49,50,51,54,57,58,59,60,62,63,64;];

it_int=       [46 47 40 37 59 35 63  9 16 58 39 64 61 29 19 18 trade] ;%ssk_9500, 32 33 combined


%it_int = [46 47 54 44 49 59 37 9 50 58 63 51 40 57 29 17 45 32 16 60 21 8 35]; % in order of intensity, total of 50% VA wt
%it_int = [46 47 54 44 49 59 37 51 50 63 58 40 17 57  9 60 16 64 35 29 31]; % in order of intensity, total of 50% VA wt


notit_int = business(~ismember(business,it_int));
excluded = business(~ismember(business,[it_int notit_int]));
assert(isempty(excluded),'IT int and notit_int do not include everyone')

fin_int=[36 37 39 70 54 59 64 38 23 29 63 65 67 56 53  5 58 60 28 57 66 32 27 ];% in order of intensity, total of 50% VA wt...except that I moved 33 
notfin_int =[33 68 61 45 20 40 15 62 44 69 21 41 35 16 14 13  4 47 24 10 42 18  7 26 30 12 17  3 46 19  9  6  8];

%aggregs= [1 2 11 22 25 31 34 43 48 55]; % aggregates. E.g., 1=manufacturing, 2=durables, 31=wholesale and retail trade, etc
% Want to exclude some industries that were 'unusual' (bubble industries)
drop_from_groups = [8 23 24 26:28 30 49:53]; % ag, mining, construction (30) AND FIRE (49:53; as coded, 54 is not dropped) 
                                                % 8/2015 Added petroleum refining (like bubble)
narrow_business = business(~ismember(business,drop_from_groups));
narrowest_business = business(~ismember(business,[drop_from_groups itprod] ));  % drops FIRE and IT prod

% IT intensity, full sample, in order

% "it_int_drop" is what I mainly use.  
% I have done IT intensity several ways.  The most "standard" is to use shares, sskv.  The BFOS
% way is to use share-weighted growth skdk.  They are similar, but a couple of differences

it_int_drop_sskv  =[46 47 59 37 63 58 40  9 60 16 64 35 29 trade]; % Just shares
it_int_drop_skdk = [46 47 35 16 59 60 37 58  9 39 40 trade 19] ; % BFOS way, based on 95-00; combines 32 and 33
%   Old 2014 values = [47,46,59,58,60,9,35,32,63,62,21,40,45,8,16,64,41] ;
    
it_int_drop = it_int_drop_skdk;  % Defines whether I use sskv or skdk

notit_int_drop = narrowest_business(~ismember(narrowest_business, it_int_drop)); 
not_mkt_services = narrowest_business(~ismember(narrowest_business, mkt_services)); 

% 25% weight  => stop at 63
% 50% stop at 35
% 75% stop at 33 
% These cutoffs are on the "drop" list (excl NR-C-F and ITprod)
it_int25 =   [46 47 59 37 63 58 40] ;   
it_int2550=  [ 9 60 16 64 35 29] ;  
it_int5075 = [ 21  8 18 39 62 19 61  3  6 31 13 ]; 
it_int7500 = [ 56  7 15 10 41  5 20 45 38 12 14 68 70 67  4 42 66 69 65 36]; 

it_int33   = [46 47 59 37 63 58 40 9];
it_int3367 = [32 60 16 64 35 29 21  8 18 39 62 19 61  3  6] ;
it_int6700 = [33 13 56  7 15 10 41  5 20 45 38 12 14 68 70 67  4 42 66 69 65 36];

fin_int_drop = [ 36 70 59 65 38 37 61 67 40 60 63 56 58 64 66 68 33 39 32];
%fin_int_drop = fin_int(~ismember(fin_int, [drop_from_groups 49:52]));  % also drop finance 
notfin_int_drop = narrowest_business(~ismember(narrowest_business, fin_int_drop)); 


% Just drop IT
if just_drop_IT_from_finance;
    fin_int_drop=[36 37 39 70 54 59 64 38 23 29 63 65 67 56 53  5 58 60 28 66 32 27 ];% in order of intensity, total of 50% VA wt...except that I moved 33 
    notfin_int_drop =[33 68 61 45 20 40 15 62 69 21 41 35 16 14 13  4 47 24 10 42 18  7 26 30 12 3 46 19  9  6  8];
    
    it_int_drop=  [46 40 37 59 35 63  9 16 58 39 64 61 29 19 18 trade] ;%ssk_9500, 32 33 combined
    notit_int_drop = business(~ismember(business,[it_int_drop itprod 49:52]));
end
 
non_it_drop =  nonit(~ismember(nonit,[drop_from_groups itprod] ));  % drops IT prod
well_drop = well(~ismember(well,[drop_from_groups itprod] ));  % drops IT prod
poor_drop = poor(~ismember(poor,[drop_from_groups itprod] ));  % drops IT prod
        


% Settings below shouldn't need to be changed until there's a new BLS dataset

format shortg

MFPstartY = 1987;  % MFP industry data.  The IT cap data have same years, 
MFPendY   = 2014;  % change if not true

IOstartY = 1997;   % Input-output table start year       
IOendY   = 2014;  

ITcapstartY = 1987; 
ITcapendY = 2014;  % Available through 2013


this_date = datestr(now,'yyyy.mm.dd'); % don't use this yet, numeric to aid sorting if needed

% add paths to libraries/etc.
addpath(s_drive_library)


IO_location = [s_drive_location filesep 'BLS-I-O'];
it_cap_location = [s_drive_location filesep 'IT capital'];

MFP_location = [s_drive_location filesep 'data'];
    
cd(s_drive_location)

save('out\Section0.mat')
save([MFP_location filesep 'mfp_start_values.mat'], 'MFPstartY', 's_drive_location', 'MFP_location') % a few things I'll need later

% functions used chadfig...



%%
%  ****************************************************************************
% Section 1:  Read in MFP Data
%  ****************************************************************************

% Bing Wang organized the BLS MFP data so that the "BLS_Raw_MasterFile.xlsx" spreadsheet has just the
% variables we want, with manufacturing on top and non-manufacturing below.  Those sheets have
% industries reading down, and years reading across.  This section of the code 
%       * reads the sheet information from the Excel file, 
%       * loops through the sheets naming the variables (matrices) from the sheet names.  
%       * It then drops the first row (which is the year name) and transposes so matrices are years x industries 


% If bls_mfp.mat doesn't exist, then need to set readmfp = true to create it (that's
% slow, but only needs to be done once).


clear all;
%cd('data') % For some reason, this works more robustly than doing load('out\Section0.mat') directly
          % my conjecture is that it has something to do with Matlab compiling...but I don't
          % know.  Possibly a rehash would help?
load('out\Section0.mat') 



    if readmfp; % This is slow, takes 92 seconds from my C: drive
                % When run, it clears everything to save JUST the BLS Raw Master File
        tic;
        
        cd(MFP_location)  % put us in the proper directory
        
        clear all;        % now can clear
     
        load('mfp_start_values.mat')  % we're already in the right directory, so reads

        mfpfile='BLS_Raw_MasterFile.2016.xlsx';
       [status,sheets] = xlsfinfo(mfpfile) ;
        % Check that start year is correct
        temp = xlsread(mfpfile, 2); % sheet 2 is the first data sheet; the first row it reads has the numeric years
        assert(abs(temp(1,1)-MFPstartY)<0.01, 'Starting year of mfpfile does not match MFPstartY')

        for i=2:size(sheets,2);  % readme is sheet #1, so start at 2
           sheetname = sheets{1,i} ;
           eval([sheetname '= xlsread(mfpfile,' num2str(i) ');']) ;% first row it reads is the numeric year
           eval([sheetname '= transpose(' sheetname '(2:end,:)) ;' ]) % Drop first row (year) and transpose so it's years x industries
        end
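        % Alternative sketch (commented out, not used here): the same "drop the year row and
        % transpose" step can store each sheet in a struct field rather than creating a
        % variable with eval.  The names data and raw are hypothetical.
        %     data      = [1987 1988 1989; 10 11 12; 20 21 22];   % first row = years, then 2 industries
        %     sheetname = 'Y_SectOutput';
        %     raw       = struct();
        %     raw.(sheetname) = transpose(data(2:end,:));          % 3 years x 2 industries
        % The rest of this program expects plain variables (Y_SectOutput, etc.), so switching
        % would require changing the later references as well.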


      % Rename some variables
        Y   = Y_SectOutput;
        K   = K_CapServ ;
        Hrs = Hrs_LaborHrs ;  % In new data, this is quality adjusted labor input. Kept old mnemonic
        N   = N_Intermed ;
        E   = E_Energy ;
        M   = M_Mat ; % materials
        S   = S_PurchServ; %services
        X   = X_ComboInputs ;

        data_to_add = {'Y' ; 'K'; 'Hrs'; 'N';'E'; 'M';'S';'X'};
        data_to_save = [sheets(1,2:end)'; data_to_add  ] ; % sheets(:,1) is readme, so start with column 2
       
        save('bls_mfp.mat',data_to_save{:} )  % In MFP directory. Added data_to_save here so that just the data, not the settings, are saved
        cd(s_drive_location)
        
        % Now reload all the 'preliminary' variables
        load('out\Section0.mat')
        
        toc;
        
    end

    if ~readmfp;
       %clear all   % probably unnecessary
       load([MFP_location '\bls_mfp.mat'])

%       load('out\bls_mfp.mat')
%       load('out\newaggs.mat','newaggs')
    end

    % Fill in intermediate inputs for manufacturing, since the BLS provides materials,
    % services, and energy separately
    
    normalize_year = 23; % what row to use? % Bing: using 2009 here
    use_tornquist=true;  % how to aggregate
    for i = 1:21; % For manufacturing industries    
         realin = [M(:,i) E(:,i) S(:,i)];
         nomin = [PMM(:,i) PEE(:,i) PSS(:,i)];
         pricein = [PM(:,i) PE(:,i) PS(:,i)];

         if use_tornquist
            % Do the Tornquist instead
            dq=100*diff(log(realin));
            PNN(:,i) = PMM(:,i)+ PEE(:,i)+ PSS(:,i); % total nominal intermediates (denominator for the nominal shares)
            shares =  [PMM(:,i) PEE(:,i) PSS(:,i)]./(repmat(PNN(:,i),1,3));
            sshares = 0.5*(shares(1:end-1,:)+shares(2:end,:)); % average in adjacent periods

            dN(:,i) = sum(dq.*sshares,2);
            N(:,i)=  exp(cumsum([0;dN(:,i)/100]));
            N(:,i) = 100*N(:,i)/N(normalize_year,i);
            PN(:,i) = PNN(:,i)./N(:,i);                  % implicit price index = nominal / real
            PN(:,i) = 100*PN(:,i)/PN(normalize_year,i);  % normalize the price index to 100 in the base year
         else
            [N(:,i), PNN(:,i), PN(:,i), dum] = fishagg_jgf(realin, nomin, pricein, normalize_year,0) ; % JGF modified 1/2014
            N(:,i)=100*N(:,i)/N(normalize_year,i); % not necessary, but normalize to 100 in base year (to match non-mf)
         end      
    end
    %  Big discrepancy here for N in manufacturing in some cases
    % plot([100*diff(log(fisher_dolmas(realin,pricein))) 100*diff(log(N(:,8)))  ]) ;
    % Reason is huge variation   [dE(:,8) dM(:,8) dS(:,8)]
    % In the end, I concluded this didn't matter.
    
    
    % A couple of 1987 variables are missing for mining ex oil and gas (#27).  That throws off the chaining.  I ought to change my chaining routine to 
    % deal with that.  To check for missing values, if needed:    any(any(isnan(PLL(:,business))))
    % In any case, fill them in here--won't matter for anything I care about
       
    PL(1,27) = PL(2,27); % Assume same "price" (wage) in years 1 and 2
    PLL(1,27) = PLL(2,27)*(Hrs(1,27)/Hrs(2,27)) ; % This makes nominal growth (growth in price times quantity, PLL) equal real growth
    
    PK(1,27) = PK(2,27); % Assume same "price" (rental rate)
    PKK(1,27) = PKK(2,27)*(PK(1,27)/PK(2,27)) ; % The ratio is 1 given the line above, so nominal capital compensation is carried back unchanged
    
    growth_vars = {'Y'
        'K'
        'Hrs'
        'N'
        'E'
        'M'
        'S'
        'X'
        'TFP'};

    for i = 1:size(growth_vars,1)  % growth rates at annual rate
       eval(['d' growth_vars{i,1} '=100*diff(log(' growth_vars{i,1} ') ) ;']) 
    end
    
    
    % Now calculate some things I want. 
    NomVA = PYY-PNN;
    PVV = NomVA; % More consistent mnemonic
    AgVA = sum(NomVA(:,business),2);
    
    DomarWt = PYY ./ repmat(AgVA,1,size(PYY,2));
    VAWt = NomVA ./ repmat(AgVA,1,size(PYY,2)) ;
    
    SN = PNN./PYY;  % Will overwrite the data read in from BLS spreadsheet, which is only 3 digits
    SE = PEE./PYY;
    SM = PMM./PYY;
    SS = PSS./PYY;
    
    SK = PKK./PYY;
    SL = PLL./PYY;
    
    SKV = PKK./PVV; % capital share of value added
    SLV = PLL./PVV; % labor share of VA
    
    SSN     = ( SN(2:end,:)+SN(1:end-1,:) )/2;
    
    SSM     = ( SM(2:end,:)+SM(1:end-1,:) )/2;
    SSE     = ( SE(2:end,:)+SE(1:end-1,:) )/2;
    SSS     = ( SS(2:end,:)+SS(1:end-1,:) )/2;
        
    SSK     = ( SK(2:end,:)+SK(1:end-1,:) )/2;
    SSL     = ( SL(2:end,:)+SL(1:end-1,:) )/2;
    
    VVAWt   = ( VAWt(2:end,:)+VAWt(1:end-1,:) )/2;
    DDomarWt = ( DomarWt(2:end,:)+DomarWt(1:end-1,:) )/2;
    
    %dX_check = SK(2:end,:).*dK+SL(2:end,:).*dHrs+SN(2:end,:).*dN; % NaN for 1:21 (manufacturing, since N doesn't exist yet)
    dX_check = SSK.*dK+SSL.*dHrs+SSN.*dN; % NaN for 1:21 (manufacturing, since N doesn't exist yet)

    dTFP_check = dY-dX_check;
    
    aaa=mean( (dTFP_check-dTFP) )' ;  % discrepancy from what the BLS did is typically pretty small
    
    dTFPv_BLS = dTFP ./(ones(size(SSN))-SSN);  % Using the BLS reported TFP and intermediate share
    dTFPv = dTFPv_BLS; % may write over this later
    contrib = dTFPv_BLS.*VVAWt;   % VA weighted VA productivity
    contribDom = dTFP.*DDomarWt;  % Domar weighted gross-output productivity
    
    % Growth in value added
    dv(:,1:70) = (dY-SSN.*dN)./ (ones(size(SSN))-SSN);
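    % Worked example of the gross-output to value-added conversions above (commented out;
    % hypothetical numbers).  With gross-output growth of 3, intermediate growth of 2, and an
    % average intermediate share of 0.5, value-added growth is (3 - 0.5*2)/(1 - 0.5) = 4.
    % Likewise, gross-output TFP growth of 1 corresponds to value-added TFP growth of 2.
    %     dY_ex = 3;  dN_ex = 2;  SSN_ex = 0.5;  dTFP_ex = 1;
    %     dv_ex    = (dY_ex - SSN_ex*dN_ex) / (1 - SSN_ex);   % 4
    %     dTFPv_ex = dTFP_ex / (1 - SSN_ex);                  % 2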
    
    
    
 
    
    
    
    
    
    %corrcoef([contrib(:,1) contribDom(:,1)])   % confirms these are very, very, very similar
    
    

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    
    % Now loop over aggregates and "newaggs" to fill in from the raw data
    
    % 
    
    Fill_in_mapping = ...   % Ensures that aggregates are consistent with the pieces as Tornquist
       {'2', '3:10', 'MN'   %  Example:  Industry "2" is manufacturing nondurables and will be aggregated from 3:10
        '11','12:21', 'MD'
        '22', '23:24', 'Ag'
        '25', '26:28', 'MI'
        '31', '32:33', 'Trade'
        '34', '35:42', 'Transp'
        '43', '44:47', 'Info'
        '48', '49:54', 'FIRE'
        '55', '56:70', 'Services' };

    Fill_in = [Fill_in_mapping ; newaggs];
    
    for i = 1:size(Fill_in,1)    
         eval(['NomVA(:,' Fill_in{i,1} ')= sum(NomVA(:,' Fill_in{i,2} '), 2);'])  % Note it's col 1 on LHS, col 2 on RHS
         eval(['VVAWt(:,' Fill_in{i,1} ')= sum(VVAWt(:,' Fill_in{i,2} '), 2);'])
         eval(['PKK(:,  '  Fill_in{i,1} ')= sum(PKK(:,   ' Fill_in{i,2} '), 2);'])
         eval(['PLL(:,  '  Fill_in{i,1} ')= sum(PLL(:,   ' Fill_in{i,2} '), 2);'])
           
         eval(['dTFPv(:,' Fill_in{i,1} ')= sum(dTFPv(:,' Fill_in{i,2} ').* VVAWt(:,' Fill_in{i,2} '),2 ) ./ VVAWt(:,' Fill_in{i,1} ');'])
         eval(['contrib(:,' Fill_in{i,1} ')=dTFPv(:,   ' Fill_in{i,1} ') .*VVAWt(:,' Fill_in{i,1} ');'])
    end
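    % Toy illustration of the weighting in the loop above (commented out; hypothetical
    % numbers).  Aggregate VA-TFP growth is the weighted average of industry growth, with
    % weights that need not sum to one, and the aggregate's contribution equals the sum of
    % the industry contributions.
    %     w_ex = [0.1 0.3];   g_ex = [2 4];         % average VA weights and VA-TFP growth, two industries
    %     wagg = sum(w_ex);                         % 0.4
    %     gagg = sum(g_ex.*w_ex) / wagg;            % (0.2 + 1.2)/0.4 = 3.5
    %     cagg = gagg * wagg;                       % 1.4 = sum(g_ex.*w_ex)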
    
     
% Aggregate other vars that need chaining (Fisher chain aggregation).
% The normalization to 100 in the base year is not necessary, but matches non-mf.

    for i = 1:size(Fill_in,1)
        mapto   = eval(Fill_in{i,1});   % column of the aggregate
        mapfrom = eval(Fill_in{i,2});   % columns of its pieces

        [Y(:,mapto), PYY(:,mapto), PY(:,mapto), dum] = ...
            fishagg_jgf(Y(:,mapfrom), PYY(:,mapfrom), PY(:,mapfrom), normalize_year, 0);
        Y(:,mapto) = 100*Y(:,mapto)/Y(normalize_year,mapto);

        [K(:,mapto), PKK(:,mapto), PK(:,mapto), dum] = ...
            fishagg_jgf(K(:,mapfrom), PKK(:,mapfrom), PK(:,mapfrom), normalize_year, 0);
        K(:,mapto) = 100*K(:,mapto)/K(normalize_year,mapto);

        [N(:,mapto), PNN(:,mapto), PN(:,mapto), dum] = ...
            fishagg_jgf(N(:,mapfrom), PNN(:,mapfrom), PN(:,mapfrom), normalize_year, 0);
        N(:,mapto) = 100*N(:,mapto)/N(normalize_year,mapto);

        [Hrs(:,mapto), PLL(:,mapto), PL(:,mapto), dum] = ...
            fishagg_jgf(Hrs(:,mapfrom), PLL(:,mapfrom), PL(:,mapfrom), normalize_year, 0);
        Hrs(:,mapto) = 100*Hrs(:,mapto)/Hrs(normalize_year,mapto);

    end
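
    % Illustration (a hedged sketch with made-up numbers; the ex_ names are hypothetical).
    % fishagg_jgf is not shown in this file, so this is only the textbook two-period
    % Fisher quantity index, which may differ in detail from what fishagg_jgf does.
    ex_q = [1 2; 2 2];      % quantities: rows are years, columns are components
    ex_p = [1 1; 1 2];      % prices, same layout
    ex_lasp = (ex_p(1,:)*ex_q(2,:)') / (ex_p(1,:)*ex_q(1,:)');   % base-year price weights: 4/3
    ex_paas = (ex_p(2,:)*ex_q(2,:)') / (ex_p(2,:)*ex_q(1,:)');   % current-year price weights: 6/5
    ex_fish = sqrt(ex_lasp*ex_paas);                             % Fisher = geometric mean of the two
    assert(ex_fish >= min(ex_lasp,ex_paas) && ex_fish <= max(ex_lasp,ex_paas))
    clear ex_q ex_p ex_lasp ex_paas ex_fish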
    
    
dY = 100*diff(log(Y));
dN = 100*diff(log(N));
dK = 100*diff(log(K));
dHrs=100*diff(log(Hrs));

SN = PNN./PYY;  % will differ from what BLS labeled 'aggregates' because the BLS removes within industry movements
SL = PLL./PYY;
SK = PKK./PYY;
SLV  = PLL./NomVA;
SKV  = PKK./NomVA;
SSN  = (SN(1:end-1,:)  + SN(2:end,:))/2;
SSL  = (SL(1:end-1,:)  + SL(2:end,:))/2;
SSK  = (SK(1:end-1,:)  + SK(2:end,:))/2;
SSLV = (SLV(1:end-1,:) + SLV(2:end,:))/2;
SSKV = (SKV(1:end-1,:) + SKV(2:end,:))/2;


% Check growth accounting identities


    dTFP_check = dY(:,1:70)-dX(:,1:70);  % should be close to what BLS reports
    %  And it is close.  To see this:  [(1:70)' mean(dTFP - dTFP_check)'];  % will be more different for aggregates  
    
    % If I use M, E, S, I'm closer than using dN
    dX_alt = SSK.*dK + SSN.*dN + SSL.*dHrs;
    dX_alt1 = SSK(:,1:70).*dK(:,1:70)+SSL(:,1:70).*dHrs(:,1:70)+ SSM.*dM+SSE.*dE+SSS.*dS;  %
    dX- dX_alt1;
    
    dTFP_new = dY-dX_alt;  
    % [(1:70)' mean(dTFP - dTFP_new(:,1:70))']; 
    % Pretty close...
    
    dv_new = (dY - SSN.*dN)./(ones(size(SSN))-SSN);
    dv-dv_new(:,1:70);  % checks out.  Matches identically for disaggregated components.  Not sure why, say, Manuf durables isn't closer
    % why are some of these categories really off? Bing 5/28/15
    Capdeep = SSKV.*(dK-dHrs);
    dLP = dv_new-dHrs;
  
   % dX_alt = SSK.*dK + SSN.*dN + SSL.*dHrs;
   
 
      % Construction is highly labor intensive
    
    mean([SK(:,30) SL(:,30) SN(:,30) SKV(:,30) SLV(:,30)])
     
    % apply this stuff to business sector
    
    dTFPvbusDom = sum(dTFP(:,business).*DDomarWt(:,business),2);
    contribbus = dTFPv(:,71); % duplicative, but helpful below
  %  dTFPvbus = dTFPv(:,71); % in case useful
    
   
    % Memo:  The lines below do NOT use average shares, and match exactly, which is good
    % dTFPv = dTFP ./(ones(size(SN(2:end,:) ) )-SN(2:end,:) ) ;
    % dTFPbus = sum(dTFPv(:,business).*VAWt((2:end),business),2);
    % dTFPbusDom = sum(dTFP(:,business).*DomarWt((2:end),business),2);
    
   
    % Now do some checks of the data
    
     % mean([dTFPv(:,71) dTFPvbusDom dTFPv(:,71)-dTFPvbusDom])  % should be close, but not identical.  They are close.
     
    [sum(VVAWt(:,business),2) VVAWt(:,71)]; % had better be one!  (it is)
    
    check1 = contrib(:,72) + contrib(:,73);% it and not-it producing
    check2 = contrib(:,74)+contrib(:,75); % well and poor
    check3 = contrib(:,76) + contrib(:,77); % IT and not IT intensive
    check4 = contrib(:,78) + contrib(:,79); % finance and not fin intensive
    mean([check1 check2 check3 check4 dTFPv(:,71) dTFPvbusDom dTFPv(:,71)-dTFPvbusDom]) ;% should be close, but not identical.  They are close.
  
    % Weights should all sum to one    
    checkwt1 = VVAWt(:,72) + VVAWt(:,73);% it and not-it producing
    checkwt2 = VVAWt(:,74)+VVAWt(:,75); % well and poor
    checkwt3 = VVAWt(:,76) + VVAWt(:,77); % IT and not IT intensive
    checkwt4 = VVAWt(:,78) + VVAWt(:,79) + sum(VVAWt(:,[49:52]),2); % finance and not fin intensive. I excluded finance itself so need to add those
    
    assert(all(abs(checkwt1-1)<1e-6) & all(abs(checkwt2-1)<1e-6) & all(abs(checkwt3-1)<1e-6) & all(abs(checkwt4-1)<1e-6) ,'checkwt1, checkwt2...Weights do not sum to one'   )
  
    %  Need to check checkwt2, checkwt4...
    

 
 
 
 
    
    
 cd(s_drive_location);   
 save(['out\section1.mat'])
    
    
    
    
 
 
 
    
    
   %% 
%  ****************************************************************************    
%  Section 2:  Read the IO data on finance shares
%  ****************************************************************************
%
%   The logic of this section is explained in the introduction
%   To update, 
%       1. Need to make sure the IOendY and IOstartY are correct. 
%       2. File path--move old file to an "OLD" directory, and copy the new/updated IO files to the same
%       original path. Check that capitalization matches the old name
%       3. Various places will need filenames changed slightly
%       4. Need to check the rows for finance industries and, if changed, the rows for mapping
%       to BLS industry numbers

    
    load([s_drive_location filesep 'out\section1.mat'])
    cd(IO_location)

    IOyears = IOendY-IOstartY+1;
    
    financecommods = [116:120];  % These are the finance commodities in I-O space, easy to modify here...

    % Now read nominal

    % Start with reading nominal totals.  Will need for intermediate shares
    nomy = [IO_location filesep 'IONom'];
    cd(nomy)

    %NOMINAL_OUTPUT_IND = csvread('NOMINAL_OUTPUT_IND9312.csv',1,1)';
    NOMINAL_OUTPUT_IND = csvread('NOMINAL_OUTPUT_IND9714.csv',1,1)';
                         % Confusingly, "1,1" means we start in B2, i.e., we drop the first row and column 
                         % (csvread's row and column offsets are zero-based), since text in cell A1 causes trouble
                         % Transpose so that years are in rows


    nominal = [IO_location filesep 'IONom\NOMINAL_USE'];
    cd(nominal)

    % Now read the nominal use tables, which require looping.  First get the dimensions and set
    % up the nomuse and realuse arrays (3 dimension)
    
    temp = csvread('NOMINAL_USE_1997.csv',1,1);  % get the dimensions.  
    sizenom = size(temp);
    nomuse = NaN(sizenom(1), sizenom(2),IOyears);  %3-dimensional array, year in third array
    realuse = nomuse; % start with NaN's of the same size
    
    % Now loop through the years to read them
    for i=1:IOyears
            yr = IOstartY+i-1; % year
            file_to_read = ['NOMINAL_USE_' num2str(yr) '.csv'] ;
            nomuse(:,:,i)=csvread(file_to_read,1,1);   % no eval needed
    end

    %any(any(isnan(nomuse)))

    % Now read real use matrices
    real = [IO_location filesep 'IOReal\REAL_USE'];
    cd(real)

    temp = csvread('REAL_USE_1997.csv',1,1);  % get/confirm the dimensions.  
                                    % Confusingly, "1,1" means we start in B2, i.e., we drop the first row and column 
                                    % (csvread's row and column offsets are zero-based), since text in cell A1 causes trouble
    sizereal = size(temp);
    assert(all(sizereal ==sizenom), 'Size of REAL_USE does not match NOMINAL_USE')

    for i=1:IOyears; % looping
            yr = IOstartY+i-1; % year
            file_to_read = ['REAL_USE_' num2str(yr) '.csv'] ;
            %eval('realuse(:,:,i)=csvread(file_to_read,1,1);');  % I don't think we need an
            %eval
            realuse(:,:,i)=csvread(file_to_read,1,1);
    end;

    % Now compute a price as ratio of nominal to real.  Lots of NaNs (zero over zero), so set those to zero
    priceuse = nomuse ./realuse;
    priceuse(isnan(priceuse))=0;
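
    % Illustration (a hedged sketch with made-up numbers; the ex_ names are hypothetical).
    % A cell that is zero in both tables gives 0/0 = NaN, which the line above resets to 0.
    % Note that a nonzero nominal value over a zero real value would give Inf, which
    % isnan does not catch.
    ex_nomuse   = [4 0];
    ex_realuse  = [2 0];
    ex_priceuse = ex_nomuse ./ ex_realuse;        % [2 NaN]
    ex_priceuse(isnan(ex_priceuse)) = 0;          % [2 0]
    assert(isequal(ex_priceuse, [2 0]))
    clear ex_nomuse ex_realuse ex_priceuse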


    Nomfin   = NaN(IOyears, sizenom(2));  % These will be years x industries,
    Realfin  =  Nomfin;              %  where finance use by the industry is in the columns
    Pricefin = Nomfin;
    
    % For each industry, create a total finance usage (basically, summing the commodities in financecommods)
    for i=1:sizenom(2)  % Now loop over columns (industries) 
        % First select the industry.  The "input" matrices below will have 
        % years in rows, and commodity use (by industry i) in the columns.

        nominput   = squeeze(nomuse(:,i,:))';
        realinput = squeeze(realuse(:,i,:))'; 
        priceinput= squeeze(priceuse(:,i,:))';

        % Now select the use-of-finance columns.
        % Note:  Could combine with lines above, except that misbehaved if we set financecommods 
        % to a scalar, e.g., 116.  Then the "squeeze" command would give us a 1x20, not 20x1
        nominput = nominput(:,financecommods);
        realinput = realinput(:,financecommods);
        priceinput= priceinput(:,financecommods);

        % Call the Fisher aggregation program.  The "13" means normalize levels to the 13th year
        % of the I-O sample, i.e., IOstartY+12 (2009 when the sample starts in 1997).
        [realout, nomout, priceout, dum] = fishagg_jgf(realinput, nominput, priceinput,13,0) ; % JGF modified 1/2014

        % Now we set the column to be finance use in the industry
        Nomfin(:,i) = nomout;
        Realfin(:,i) = realout;
        Pricefin(:,i) = priceout;

    end

    % Column 181 is private households (within NAICS 81), and seems to use no intermediates.  In the
    % aggregation above, that leads fishagg to return NaNs, not zeros.  That causes problems below
    % when I aggregate "other services", so I set column 181 to zero
    settozero=[181];  % Could add other industries here but, for now, do not need to
    Nomfin(:,settozero)=0;
    Realfin(:,settozero)=0;
    Pricefin(:,settozero)=0;

    % Now aggregate into BLS MFP categories.  agglist maps to BLS MFP categories, with an
    % "additional" total private economy which is #71.

    %Columns of agglist are
    %    1. IO- Industries to be aggregated to match BLS MFP industries
    %    2. memo item of what the NAICS codes are
    %    3. memo item of the name
    % rows of agglist should be ordered to match the BLS MFP industry order

    % These are industries 1:71 of the BLS.  the first column shows the mapping from IO tables
    %'BLS_IO_Summary'	'NAICS_2007'	'SectorTitle'
    agglist = ...
    {'16:91'			'31-33'	'Manufacturing'
    '[16:28 32:44]'		'31, 322-326'	'Nondurable goods'
    '16:26'			'311, 312'	'Food, beverage and tobacco product manufacturing'
    '27'			'313, 314'	'Textile and textile product mills'
    '28'			'315, 316'	'Apparel, leather, and allied product manufacturing'
    '32:33'			'322'		'Paper manufacturing'
    '34'			'323'		'Printing and related support activities'
    '35'			'324'		'Petroleum and coal products manufacturing'
    '36:42'			'325'		'Chemical manufacturing'
    '43:44'			'326'		'Plastics and rubber products manufacturing'
    '[29:31 45:91]'	'321, 327, 33'	'Durable goods'
    '29:31'			'321'		'Wood product manufacturing'
    '45:48'			'327'		'Nonmetallic mineral product manufacturing'
    '49:53'			'331'		'Primary metal manufacturing'
    '54:62'			'332'		'Fabricated metal product manufacturing'
    '63:69'			'333'		'Machinery manufacturing'
    '70:75'			'334'		'Computer and electronic product manufacturing'
    '76:79'			'335'		'Electrical equipment, appliance, and component manufacturing'
    '80:86'			'336'		'Transportation equipment manufacturing'
    '87:89'			'337'		'Furniture and related product manufacturing'
    '90:91'			'339'		'Miscellaneous manufacturing'
    '1:6'			'11'		'Agriculture, forestry, fishing, and hunting'
    '1:2'			'111, 112'	'Farms'
    '3:6'			'113-115'	'Forestry, fishing, hunting, and related activities'
    '7:11'			'21'	'Mining'
    '7'			'211'       'Oil and gas extraction'
    '8:10'			'212'	'Mining, except oil and gas'
    '11'			'213'	'Support activities for mining'
    '12:14'			'22'	'Utilities'
    '15'			'23'	'Construction'
    '92:96'			'NaN'	'Trade'    
    '92'			'42'	'Wholesale trade'
    '93:96'			'44-45'	'Retail trade'
    '97:105'		'48, 492, 493'	'Transportation and warehousing'
    '97'			'481'           'Air transportation'
    '98'			'482'           'Rail transportation'
    '99'			'483'           'Water transportation'
    '100'			'484'           'Truck transportation'
    '101'			'485'           'Transit and ground passenger transportation'
    '102'			'486'           'Pipeline transportation'
    '103:104'		'487, 488, 492'	'Other transportation and support activities'
    '105'			'493'           'Warehousing and storage'
    '106:115'		'51'            'Information'
    '106:107'		'511'           'Publishing (incl. software)'
    '108'			'512'           'Motion picture and sound recording industries'
    '109:113'		'515, 517'          'Broadcasting and telecommunications'
    '114:115'		'518, 519'           'Information and Data Processing Services '
    '116:121'		'NaN'           'Finance, Insurance, and Real Estate '
    '116'			'521, 522'       'Credit intermed. and related activities'
    '117'			'523'            'Securities, commods, and other fin. invest. activities'
    '118:119'		'524'           'Insurance carriers and related activities'
    '120'			'525'           'Funds, trusts, and other financial vehicles'
    '121'			'531'           'Real estate'
    '122:125'		'532, 533'      'Rental and leasing services and lessors of intangible assets'
    '126:181'		'54'            'Services'  % changed by JGF to be all services. Maybe drop 181?
    '126'			'5411'          'Legal services'
    '130'			'5415'          'Computer systems design'
    '[127:129 131:134]'	'5412-5414, 5416-5419'	'Miscellaneous professional, scientific, and technical services'
    '135'			'55'            'Management of companies and enterprises'
    '136:143'		'561'           'Administrative and support services'
    '144'			'562'           'Waste management and remediation services'
    '145:147'		'61'            'Education services'
    '148:154'		'621'           'Ambulatory health care services'
    '155:156'		'622, 623'      'Hospitals and nursing and residential care facilities'
    '157:159'		'624'           'Social assistance'
    '160:164'		'711, 712'      'Performing arts, spectator sports, museums, and related industries'
    '165:167'		'713'           'Amusement, gambling, and recreation industries'
    '168'			'721'           'Accommodation'
    '169'			'722'           'Food services and drinking places'
    '170:181'		'81'            'Other services'  
    '1:181'         'PR'            'Private business'}   ;  % Private business at the end


    num_extra_aggregates=size(newaggs,1)-1;  % add extra columns to allow for itprod, nonit, well, poor, etc.  The "-1" is because business (71) repeats
    AggNomUse = NaN(IOyears,size(agglist,1)+num_extra_aggregates); 
    AggRealUse = AggNomUse;  % NaN of same size
    AggPrice =   AggNomUse;
    AggNomInd =  AggNomUse;

    % Now fill in the aggregates    
    for i=1:size(agglist,1) 
        agginds = eval(agglist{i,1});  % Shows which I-O industries get aggregated to match the MFP industry (MFP indexed by i)

        [AggRealUse(:,i), AggNomUse(:,i),AggPrice(:,i), dum] = ...
            fishagg_jgf(Realfin(:,agginds),Nomfin(:,agginds), Pricefin(:,agginds),13,0) ; % JGF modified 1/2014
%     
        AggNomInd(:,i)=sum(NOMINAL_OUTPUT_IND(:,agginds),2);
    end

    % Save column 71 as a check (will recalculate, since 71=bus in newaggs)
    AggNomUse71 = AggNomUse(:,71);
    AggRealUse71 = AggRealUse(:,71);
    AggPrice71 = AggPrice(:,71);
    AggNomInd71 = AggNomInd(:,71);
    
    
     % Now some additional subaggregates
   
     for i = 1:size(newaggs,1)
          mapto   = eval(newaggs{i,1});   % column of the new aggregate
          mapfrom = eval(newaggs{i,2});   % columns of its pieces

          % Do the Fisher/Chain Aggregation
          [AggRealUse(:,mapto), AggNomUse(:,mapto), AggPrice(:,mapto), dum] = ...
              fishagg_jgf(AggRealUse(:,mapfrom), AggNomUse(:,mapfrom), AggPrice(:,mapfrom), 13, 0);

          AggNomInd(:,mapto) = sum(AggNomInd(:,mapfrom), 2);
     end
   
    [100*diff(log(AggRealUse71)) 100*diff(log(AggRealUse(:,71)))] ; % in growth rates, very tiny differences
     
     
     % Now calculate shares
     
     FinShare = 100*AggNomUse ./ AggNomInd;  % confirms for farming!
     
     AveFinShare = 0.5*(FinShare(2:end,:)+FinShare(1:end-1,:) );
     
     dAggRealuse = 100*diff(log(AggRealUse)) ; % Real growth
     
     sf=FinShare;  % more natural mnemonics
     ssf = AveFinShare;
     
     sfdf = AveFinShare .* dAggRealuse;
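
     % Illustration (a hedged sketch with made-up numbers; the ex_ names are hypothetical).
     % sfdf weights real growth in finance use by the average of the share in t-1 and t.
     % Both pieces are scaled by 100 above, so sfdf is 100 times a percentage-point contribution.
     ex_share = [10; 12];                                      % finance share in t-1 and t, percent
     ex_real  = [100; 110];                                    % real finance use in t-1 and t
     ex_avg   = 0.5*(ex_share(2:end) + ex_share(1:end-1));     % average share: 11
     ex_dreal = 100*diff(log(ex_real));                        % 100*log(1.1), about 9.53
     ex_sfdf  = ex_avg .* ex_dreal;
     assert(abs(ex_avg - 11) < 1e-9 && abs(ex_sfdf - 1100*log(1.1)) < 1e-9)
     clear ex_share ex_real ex_avg ex_dreal ex_sfdf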

     mean(sfdf)'  ;

     % Now do a check by aggregating directly over the I-O industries in private business (1:181)
     [realout, nomout, pricout, dum] = ...
         fishagg_jgf(Realfin(:,1:181),Nomfin(:,1:181), Pricefin(:,1:181),13,0) ; 
     
     drealout = 100*diff(log(realout));
    
     % The following should be equal and it is!
     % plot([drealout dAggRealuse(:,71)])
    
    
     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     %     Now do some subsample calculations. 
     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        
     %The newest IO tables have a shorter sample (starting
     % 1997) so need to truncate dates.  Below was modified somewhat quick/dirty.
    
     if IOstartY < 1995
        finyear95 = 1995- IOstartY;  % Row of the 1995 growth rate: growth rates start in IOstartY+1, so year t is row t-IOstartY
        finyear00 = 2000 - IOstartY ;
        finyear04 = 2004- IOstartY;
        finyear07 = 2007- IOstartY;
        finendyr  = IOendY - IOstartY;

        finsmpl_full = 1:(IOendY-IOstartY);
        finsmpl_8890 = [];   % needed because '_8890' is in subsamples below; no finance data that early
        finsmpl_pre95 =1:finyear95 ;
        finsmpl_8800 = 1:finyear00;
        finsmpl_9500 = (finyear95+1):finyear00;
        finsmpl_0004 = (finyear00+1):finyear04;
        finsmpl_9504 = (finyear95+1):finyear04;
        finsmpl_0407 = (finyear04+1):finyear07;
        finsmpl_07end = (finyear07+1):(IOendY-IOstartY);
        finsmpl_04end = (finyear04+1):(IOendY-IOstartY);
    else
        finyear95 = [] ;
        finyear00 = 2000 - IOstartY ;
        finyear04 = 2004- IOstartY;
        finyear07 = 2007- IOstartY;
        finendyr  = IOendY - IOstartY;

        finsmpl_full = 1:(IOendY-IOstartY);
        finsmpl_8890 = [];
        finsmpl_pre95 =[];%1:finyear95 ;
        finsmpl_8800 = 1:finyear00;
        finsmpl_9500 = 1:finyear00;
        finsmpl_0004 = (finyear00+1):finyear04;
        finsmpl_9504 = 1:finyear04;   % finyear95 is empty here, so start at the first available year (as for finsmpl_9500)
        finsmpl_0407 = (finyear04+1):finyear07;
        finsmpl_07end = (finyear07+1):(IOendY-IOstartY);
        finsmpl_04end = (finyear04+1):(IOendY-IOstartY);
        
    end
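
    % Illustration (a hedged sketch; the ex_ names and the 1997 start year are assumptions
    % for the example only).  Rows of the growth-rate arrays are indexed by year minus
    % start year, because the first growth rate is for the year after the sample starts.
    ex_startY = 1997;
    ex_rows   = (2000-ex_startY+1):(2004-ex_startY);     % the "_0004" sample: rows 4:7
    assert(isequal(ex_startY + ex_rows, 2001:2004))      % i.e., growth rates for 2001-2004
    clear ex_startY ex_rows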
   
    
    
    subsamples = {'_full' '_pre95' '_8800' '_9500' '_0004' '_9504' '_0407' '_07end' '_04end' '_8890'};
    numSub = size(subsamples,2);
    
    varnames = {'FinShare' 'sfdf' 'dAggRealuse' 'sf'};
  
    acceldefs = {'_accel95'   '(:,6)'   '(:,2)' 
                 '_accel04'   '(:,7)'   '(:,5)'
                 '_accel9500'   '(:,4)'   '(:,2)'
                 '_accel04end'   '(:,9)'   '(:,5)'} ;
    
    colLabels = [subsamples acceldefs(:,1)']    ;     
    accelCols = size(acceldefs,1);  % columns for acceleration to be added to, say, sfdftab
    
    % Now create the actual table, e.g., FinSharetab, which will have columns for subsamples
    % and for the acceleration columns. (I think some acceleration columns will be problematic
    % in what follows)
    numsubsamples = size(subsamples,2);
    for i = 1:size(varnames,2);  % subsample means
        for j = [1:numsubsamples]   % 2 is pre95=NaN, last one 8890 is before we have finance, so that will be NaN
            definition = [ varnames{1,i} subsamples{1,j} ' = transpose( mean(' varnames{1,i} '(finsmpl' subsamples{1,j} ',:)) ) ;' ] ;
            % example: definition= 'FinShare_9500 = transpose( mean(FinShare(finsmpl_9500,:)) ) ;'
            eval(definition)
            eval( [varnames{1,i} 'tab(:,j) = ' varnames{1,i} subsamples{1,j} ';'])
        end 
        definition = [varnames{1,i} 'tab(:,' num2str(numsubsamples) ')= NaN;'] ;
        eval(definition);
    end
     
   NN = size(subsamples,2);  % Now do the acceleration measures
   for i = 1:size(varnames,2);
       for j = 1:accelCols; 
           lhs =  [varnames{1,i} acceldefs{j,1}  ] ;
           rhs1 = [varnames{1,i} 'tab' acceldefs{j,2} ] ;
           rhs2 = [varnames{1,i} 'tab' acceldefs{j,3} ] ;
           eval([lhs '=' rhs1 '-' rhs2 ';']);
           
           eval([varnames{1,i} 'tab(:,' num2str(NN+j) ') = ' lhs ';'] ) 
       end
   end
      
   
    
cd(s_drive_location)        
save('out\Section2.mat')








    
      
%     
      
%%     
%  ****************************************************************************
%  Section 3:  Read in BLS IT capital income and growth
%  ****************************************************************************

    % Cell locations are hardcoded!

    % I did a mapping in a spreadsheet, mapping-it-capital-to-bls-mfp.xlsx, between the IT measures
    % (the RHS below) and the LHS of realitcap.  The issues are that the IT capital data doesn't
    % have the aggregates in it, or crops (within agriculture), which is why there are gaps on the LHS.  On the RHS, for IT
    % measures, there's a "non-manufacturing" measure, which isn't in the MFP industries, which is
    % why there's a gap there.
    % Checking this, it seems to work
    
    % John updated 8/2015.  Now need to worry about software
    % To update:
    % - Get new BLS spreadsheets from http://www.bls.gov/mfp/mprdload.htm
    % - Change it_cap_location in Section0
    % - Check ranges, since those are hardcoded
    % - After reading the data, spot check against the original spreadsheet
    
    
    clearvars
    load('out\Section2.mat')

    

    map_mfp =  [1 3:10 12:21 23 24 26:30 32 33 35:42 44:47 49:54 56:71]; % mfp industries 
    map_it   = [4:22 2 24:64 1];  % it industries that correspond to mfp
    assert(size(map_mfp,2)==size(map_it,2))

    cd(it_cap_location)

 %  Information Capital and Related Measures 1987 - 2013
 
   % Next step is to get nominal payments/cost of IT (levels, billions).  
   % Cell locations are hardcoded, and differ for first table from others (off by 1 row)
   % Note transpose
    raw_itcap   =( xlsread('itcapbymeasure.xlsx','1.2','C8:AD71') )';   % Real Capital Input,  Indexes = 100.000, Base Year = 2009
    raw_itnom   =( xlsread('itcapbymeasure.xlsx','3.1','C7:AD70') )';   %   Capital Income (Billions of Dollars)
    raw_realinv =( xlsread('itcapbymeasure.xlsx','7.1','C7:AD70') )';   %  Gross Investment (Billions of 2009 Dollars)        
    raw_priceinv  =( xlsread('itcapbymeasure.xlsx','8.1','C7:AD70') )';   % Price Deflator, 2009=1

    % Note:    We have to map the IT list to the MFP list, which is what we do now.  
    % The mapping introduces a lot of zeros for aggregates that we'll fill in below when
    % we chain-aggregate.
    
    % Now do the mapping
    realitcap_hard(:,map_mfp)=raw_itcap(:,map_it);
    nomitcap_hard(:,map_mfp)=raw_itnom(:,map_it);

    realitinv_hard(:,map_mfp)=raw_realinv(:,map_it);
    priceitinv_hard(:,map_mfp)=raw_priceinv(:,map_it); % price deflator for investment

    priceitcap_hard = nomitcap_hard./realitcap_hard;  % this will be a user-cost weight
    assert(~any(any(isnan(priceitcap_hard(:,map_mfp)))), 'NaNs in priceitcap_hard')  % Shouldn't be any NaNs in the mapped industries (unmapped aggregate columns are 0/0 until filled in below)
    nomitinv_hard   = realitinv_hard.*priceitinv_hard;

    
   % now do the same thing for IPP.  Only need software so locations are quite different from
   % ones above.
    
    raw_softcap   =( xlsread('ippcapbymeasure.xlsx','1.2','C162:AD225') )';   % Real Capital Input,  Indexes = 100.000, Base Year = 2009
    raw_softnom   =( xlsread('ippcapbymeasure.xlsx','3.1','C159:AD222') )';   %   Capital Income (Billions of Dollars)
    raw_realsoftinv =( xlsread('ippcapbymeasure.xlsx','7.1','C159:AD222') )';   %  Gross Investment (Billions of 2009 Dollars)        
    raw_pricesoftinv  =( xlsread('ippcapbymeasure.xlsx','8.1','C159:AD222') )';   % Price Deflator, 2009=1
    
        % Now do the mapping
    realitcap_soft(:,map_mfp)=raw_softcap(:,map_it);
    nomitcap_soft(:,map_mfp)=raw_softnom(:,map_it);

    realitinv_soft(:,map_mfp)=raw_realsoftinv(:,map_it);
    priceitinv_soft(:,map_mfp)=raw_pricesoftinv(:,map_it);

    priceitcap_soft = nomitcap_soft./realitcap_soft;    
    priceitcap_soft(:,  any(isnan(  priceitcap_soft )) ) = 0 ; % Zero out every column that contains a NaN in the ratio,
                                                % since NaNs will cause problems with chaining
                                                % Note: the NaNs arise in cases where we calculate 0/0
    nomitinv_soft   = realitinv_soft.*priceitinv_soft;

    
    
    % Now need to chain-weight hardware and software to get the "old" definition (IT capital is
    % now just hardware).  Note:  Won't normalize index to 100 in base year, but that doesn't
    % matter. The LHS names in the fishagg_jgf procedure will correspond to the broader 
    % definition that we want (and had previously...I hope!  jgf 8/2015)
    
    for i = 1:size(realitcap_hard,2)
        % to keep the call to fishagg_jgf relatively readable, I define the terms that we're combining here 
        realcap = [realitcap_hard(:,i) realitcap_soft(:,i)];
        nomcap =  [nomitcap_hard(:,i) nomitcap_soft(:,i)];
        pricecap = [priceitcap_hard(:,i) priceitcap_soft(:,i)];
        
        [realitcap(:,i), nomitcap(:,i), priceitcap(:,i), dum] = ...
           fishagg_jgf(realcap, nomcap, pricecap,23,0) ; % JGF modified 8/2015
   
        realinv = [realitinv_hard(:,i) realitinv_soft(:,i)];
        nominv =  [nomitinv_hard(:,i) nomitinv_soft(:,i)];
        priceinv = [priceitinv_hard(:,i) priceitinv_soft(:,i)];
              
       [realitinv(:,i), nomitinv(:,i), priceitinv(:,i), dum] = ... 
           fishagg_jgf(realinv, nominv, priceinv,23,0) ;
    
    end
    
   
    % Fill in some NaNs, e.g. for aggregates.  Note:  Won't normalize index to 100 in base year, but that doesn't
    % matter.  Note that Fill_in was defined in Section 1 and includes subaggregates (Trade,
    % MF, etc) and special aggregates (IT producing, etc).
    % Note also that all industry codes are the MFP codes
        
    for i = 1:size(Fill_in,1)
       eval(['mapto   =' Fill_in{i,1} ';'])
       eval(['mapfrom =' Fill_in{i,2} ';'])

       % Now use the mapping to Chain-aggregate
       [realitcap(:,mapto), nomitcap(:,mapto), priceitcap(:,mapto), dum] = ...
           fishagg_jgf(realitcap(:,mapfrom), nomitcap(:,mapfrom), priceitcap(:,mapfrom),23,0) ; % JGF modified 1/2014
    
       [realitinv(:,mapto), nomitinv(:,mapto), priceitinv(:,mapto), dum] = ... 
           fishagg_jgf(realitinv(:,mapfrom), nomitinv(:,mapfrom), priceitinv(:,mapfrom),23,0) ; % BW: changed base year from 19 to 23 (2005 to 2009)
    
    end

    

    % Done with filling in NaNs in the IT data
    
    dkit=diff(log(realitcap));
    dinvit = diff(log(realitinv));
        

    % Now I need to calculate IT shares.  Calc relative to VA, which I have for full sample
    skv = 100*nomitcap ./ NomVA;   % it share of value added (bad mnemonic.  SKV is already defined as capital's share)
    sskv = (skv(1:end-1,:)+skv(2:end,:))/2; 
    skdk = sskv.*dkit;
    
    sinv = 100*nomitinv ./NomVA;
    ssinv= (sinv(1:end-1,:)+sinv(2:end,:))/2;
    sidi = ssinv.*dinvit;
    
    
     % Now do some subsample calculations
     
     
    year90 = 1990- MFPstartY; 
    year95 = 1995- MFPstartY;  % Row of the 1995 growth rate: growth rates start in MFPstartY+1, so year t is row t-MFPstartY
    year00 = 2000 - MFPstartY;
    year04 = 2004- MFPstartY;
    year07 = 2007- MFPstartY;
    endyr  = MFPendY - MFPstartY;
    
    smpl_full = 1:(MFPendY-MFPstartY);
    smpl_pre95=1:year95 ;
    
    smpl_8890 = 1:year90;
    smpl_8800 = 1:year00;   % MFP sample, so use year00 (not the I-O based finyear00)
    smpl_9500 = (year95+1):year00;
    smpl_0004 = (year00+1):year04;
    smpl_9504 = (year95+1):year04;
    smpl_0407 = (year04+1):year07;
    smpl_07end = (year07+1):(MFPendY-MFPstartY); % Calculate to end of sample
    smpl_04end = (year04+1):(MFPendY-MFPstartY); % Calculate to end of sample
    
    
    % Now do subsample means for contribs, TFP growth, and weights, etc
    % Change the label to last year/end point of the sample in lines 8, 9, and 14
    
    colLabels = {'Full sample'   % 1               
                 'Pre-1995',     % 2
                 '1988-00',      % 3
                 '1995-2000',    % 4
                 '2000-2004',    % 5    
                 '1995-2004',    % 6
                 '2004-07',      % 7
                 '2007-14',      % 8
                 '2004-14',      % 9
                 '1988-90'      %10
                 'Chng after 1995 (to 2004)'             % 11  
                 'Chng after 2004 (04-07 less 95-04)'    % 12 
                 'Chg 95-00 from pre95'                  % 13
                 'Chng after 2004 (04-14 less 95-04)' };  %14
                 
                 
    
    
    
    subsamples = {'_full' '_pre95'  '_8800'  '_9500' '_0004' '_9504' '_0407' '_07end' '_04end' '_8890'};
    numSub = size(subsamples,2);
    
    varnames = {'contrib' 'dTFPv' 'VVAWt' 'skdk' 'sskv' 'ssinv' 'sidi'};
  
    acceldefs = {'_accel95'   '(:,6)'   '(:,2)'   % col 1 is accel label; the accel is the col-2 entry less the col-3 entry
                 '_accel04'   '(:,7)'   '(:,5)'
                 '_accel9500'   '(:,4)'   '(:,2)'
                 '_accel04end'   '(:,9)'   '(:,5)'} ;
 
             
             
             
             
    colLabels = [subsamples acceldefs(:,1)']   ;      
    accelCols = size(acceldefs,1);  % columns for acceleration
    contribtab = NaN(size(contrib,2),numSub+accelCols);
    dTFPvtab = contribtab;   % name must match varnames ('dTFPv'), since Matlab is case sensitive
    VVAWttab = contribtab;

    for i = 1:size(varnames,2);
        for j = 1:size(subsamples,2)
            definition = [ varnames{1,i} subsamples{1,j} ' = transpose( mean(' varnames{1,i} '(smpl' subsamples{1,j} ',:)) ) ;' ] ;
            % example: definition= 'VVAWt_0411 = transpose( mean(VVAWt(smpl_0411,:)) ) ;'
            eval(definition)
            eval( [varnames{1,i} 'tab(:,j) = ' varnames{1,i} subsamples{1,j} ';'])
        end        
    end

   NN = size(subsamples,2);
   for i = 1:size(varnames,2);
       for j = 1:accelCols; 
           lhs =  [varnames{1,i} acceldefs{j,1}  ] ;
           rhs1 = [varnames{1,i} 'tab' acceldefs{j,2} ] ;
           rhs2 = [varnames{1,i} 'tab' acceldefs{j,3} ] ;
           eval([lhs '=' rhs1 '-' rhs2 ';']);
           
           eval([varnames{1,i} 'tab(:,' num2str(NN+j) ') = ' lhs ';'] ) 
       end
   end
        
    
    

        


cd(s_drive_location)        
save('out\Section3.mat')





    
%%     
%  ****************************************************************************
%  Section 4:  Now do calculations on the data
%  ****************************************************************************


%  Identify IT using/non-IT using industries by sorting sskvtab(:,business) by IT
%  intensity.  Also finance intensive.
%  These need to be calculated after running the code above.  But then the lists are inputs
%  into data calculations done earlier.  So they were done here ONCE, then pasted to the top of
%  the program, e.g., it_int_drop.
%  My baseline used the full sample average shares.  Arguably, should use early shares (prior 
%  to the period under consideration).  Doesn't make too much difference to results, when I checked 
%  after the fact.  Relative to early, using full sample adds 8 (pet and coal prod), 35 (air trans); 
%  drops 18 (el eq) and 39 (transit)
% late sample:   8  9 16    21 29 32 35 37    40 45 46 47 58 59 60    63
% early smpl:       9 16 18 21 29 32    37 39 40 45 46 47 58 59 60 62 63 64


    clearvars
    load('out\Section3.mat')

    calc_shares = true;
        if calc_shares;

            
       % IT intensive based on sskv (the standard way)
            
            ITcutoff = 0.5;
            ssk_to_use = sskv_9500;
            [Temp ITindex]=sort(ssk_to_use, 'descend') ; % just care about the ITindex
            sskvbus=ITindex(ismember(ITindex,business));  % drop the aggregates
            % [(1:60)' sskvbus cumsum(VVAWt_full(sskvbus))]
            cutoff_sskv=find(cumsum(VVAWt_full(sskvbus))>ITcutoff,1);  % Figure out 50 percent weight
            it_int_calc = sskvbus(1:cutoff_sskv) ;% not sorted
            notit_int_calc = sskvbus(cutoff_sskv+1:end); % not sorted.  won't use
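
            % Illustrative sketch only: how the cumulative-weight cutoff behaves on made-up
            % numbers.  The demo_* names and values are hypothetical and are cleared below.
            % Industries are ranked by share (descending); the "intensive" group is the
            % shortest top-ranked list whose cumulative weight strictly exceeds the cutoff,
            % so it always holds somewhat more than half of the weight.
            demo_share  = [0.30 0.10 0.50 0.20]';   % hypothetical IT shares, 4 industries
            demo_weight = [0.25 0.25 0.25 0.25]';   % hypothetical value-added weights
            [demo_sorted, demo_order] = sort(demo_share, 'descend');    % demo_order = [3 1 4 2]'
            demo_cut = find(cumsum(demo_weight(demo_order)) > 0.5, 1);  % cumsum hits exactly 0.5 at 2, so demo_cut = 3
            demo_intensive = demo_order(1:demo_cut);                    % [3 1 4]', holding 75 percent of weight
            clear demo_*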

        %  IT intensive drop
            sskvbus_drop=ITindex(ismember(ITindex,narrowest_business));  % drop the aggregates
            
            sum_drop=cumsum(VVAWt_full(sskvbus_drop));
            aaa=[(1:size(sum_drop,1))' sskvbus_drop ssk_to_use(sskvbus_drop) sum_drop/sum_drop(end) contribtab(sskvbus_drop,12) cumsum(contribtab(sskvbus_drop,12))]  ;
            aaa = addLabels(aaa,agglist(sskvbus_drop,3), {' ','industry', 'ssk', 'cum weight', 'post-04 slowdn' 'cum slowdown'});
            
            cutoff_sskv_drop=find( (sum_drop/sum_drop(end))>ITcutoff,1);  % Figure out 50 percent weight
            it_int_drop_sskv_calc = sskvbus_drop(1:cutoff_sskv_drop) ;% not sorted
            %notit_int_drop = sskvbus_drop(cutoff_sskv_drop+1:end); % not sorted.  won't use
            
            % it_int_drop=[  46 47 59 37  9 58 63 40 29 45 32 16 60 21 8 35 64 62 18 39 19 3 61  6 13 ]
        
            
            
% Now using skdk (the BFOS definition)

            ITcutoff = 0.5;
            skdk_to_use = skdk_0407-skdk_0004;
            [Temp ITindex]=sort(skdk_to_use, 'descend') ; % just care about the ITindex
            skdkbus=ITindex(ismember(ITindex,business));  % drop the aggregates
            % [(1:60)' skdkbus cumsum(VVAWt_full(skdkbus))]
            cutoff_skdk=find(cumsum(VVAWt_full(skdkbus))>ITcutoff,1);  % Figure out 50 percent weight
            it_int_skdk_calc = skdkbus(1:cutoff_skdk) ;% not sorted
            notit_int_skdk_calc = skdkbus(cutoff_skdk+1:end); % not sorted.  won't use

        %  IT intensive drop
            skdkbus_drop=ITindex(ismember(ITindex,narrowest_business));  % drop the aggregates
            
            sum_drop=cumsum(VVAWt_full(skdkbus_drop));
            bbb=[(1:size(sum_drop,1))' skdkbus_drop skdk_to_use(skdkbus_drop) sum_drop/sum_drop(end) contribtab(skdkbus_drop,12) cumsum(contribtab(skdkbus_drop,12)) ] ;
            bbb = addLabels(bbb,agglist(skdkbus_drop,3), {' ','industry', 'skdk', 'cum weight', 'post-04 slowdn' 'cum slowdown'});
            
            cutoff_skdk_drop=find( (sum_drop/sum_drop(end))>ITcutoff,1);  % Figure out 50 percent weight
            it_int_drop_skdk_calc = skdkbus_drop(1:cutoff_skdk_drop) ;% not sorted
            %notit_int_drop = skdkbus_drop(cutoff_skdk_drop+1:end); % not sorted.  won't use
           
           % 'Industries that are in it_int based on skdk but not based on sskv'
           % it_int_drop_skdk_calc(~ismember(it_int_drop_skdk_calc, it_int_drop_sskv_calc))
            
           %  'Industries that are in it_int based on sskv but are not based on skdk'
           % it_int_drop_sskv_calc(~ismember(it_int_drop_sskv_calc, it_int_drop_skdk_calc))
            
            % default here is it_int_drop uses skdk.  But in the main text I can do it the
            % other way (sskv)
            
            
%   Now do finance          

            [Temp Finindex]=sort(FinShare_full, 'descend') ; % Full sample average.  just care about the Finindex
            %bus_ex_finance = business(~ismember(business,[49:52])) ;%don't want finance industries which are hugely finance intensive
            FinSharebus=Finindex(ismember(Finindex,non_finance) );  % Just want the non-finance business industries (56 of them) 
            
            sum_fin = cumsum(VVAWt_full(FinSharebus) );
            cutoff_Fin=find( ( sum_fin/sum_fin(end) ) >0.5,1);  % Figure out 50 percent weight
            fin_int_calc = FinSharebus(1:cutoff_Fin) ;% not sorted
            notfin_int_calc = FinSharebus(cutoff_Fin+1:end); % not sorted.  won't use
            % fin_int=[52 51 49 50 36 70 53 59 54 65 23 38 28 37 61  67 40 60 63 56 44 58 64 27 66 68]

            
            FinSharebus_drop=Finindex(ismember(Finindex,narrowest_business) );  % Just want the business industries (60 of them) 
            sum_drop_fin = cumsum(VVAWt_full(FinSharebus_drop) );
            cutoff_Fin_drop=find( (sum_drop_fin/sum_drop_fin(end) )>0.5,1);  % Figure out 50 percent weight
 
            fin_int_drop_calc = FinSharebus_drop(1:cutoff_Fin_drop) ;% not sorted
            notfin_int_drop_calc = FinSharebus_drop(cutoff_Fin_drop+1:end); % not sorted.  won't use
        
            % fin_int_drop = [ 36 70 59 65 38 37 61 67 40 60 63 56 58 64 66 68 33 39 32 45 35 20 15 69 29 21 47]
        
   

        end  % end of calculations of shares.  If run, can update the list back in Section 0
        
        
        

cd(s_drive_location)        
save('out\Section4.mat')






clearvars
load('out\Section4.mat')




% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   Tables

%  List of industry names for tables.  agglist has the BLS MFP names.  newaggs has a bunch of
%  new aggregates that I created based on the BLS industries

    namelist_noindent = [agglist(1:end-1,3) ; newaggs(:,3)];  % agglist(end,3) repeats newaggs(1,3)

% Indentation for formatting writeouts
% Default is no indentation
    namelist = namelist_noindent;


% Largest indent.  Default is all individual industries, but some are overwritten below where needed
for i = [business 80:89 90 91 92 98 106 ]
   namelist{i,1}=['      '  namelist_noindent{i,1} ] ;
end


% Medium
for i =  [1 2 11 22 25 29 31 34 43 46 48 55 88 100 101]
    namelist{i,1}=['    '  namelist_noindent{i,1} ] ;
end

% Smallest indent
for i = [ 102 72:79 94 95 96 97 103 104 105   ]% 2
    namelist{i,1}=['  '  namelist_noindent{i,1} ] ;
end










% What's in lists?  I have a bunch of lists, and I want to know what's in them.  So I create
% a matrix called listmapping that shows the lists, with X's to mark what's in them
 
 
listtoprint={
'itprod'
'nonit'
'well'
'poor'
'it_int'
'notit_int'
'narrow_business'
'narrowest_business'
'non_it_drop'
'it_int_drop'
'notit_int_drop'
'fin_int_drop'
'notfin_int_drop'
'well_drop'
'poor_drop'};


    
listmapping=cell(1+70,2+size(listtoprint,1));
listmapping(2:end,1)=num2cell((1:70)' )  ;
listmapping(2:end,2)=namelist(1:70, 1);
listmapping(1,3:end)=listtoprint';

for i = 1:size(listtoprint,1)
  list = eval(listtoprint{i,1});
    for j = 1:70
        if ismember(j,list);
           listmapping{j+1,i+2} = 'X';
        else
            listmapping{j+1,i+2} = [];
        end
    end    
end
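
% Illustrative sketch only: the inner loop over j above could instead use logical indexing
% to mark list members in one assignment.  The demo_* names are hypothetical and are cleared
% below; cells that are not selected keep the default empty value [].
demo_map  = cell(5,1);                              % 5 made-up industries
demo_list = [2 4];                                  % made-up list membership
demo_map(ismember(1:5, demo_list)) = {'X'};         % cells 2 and 4 now hold 'X'
clear demo_*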

    
    


% Summary tables in style of Basu-Fernald 2008 or BFOS 2003
% Change labels (line 8, 9, 14) to last year in sample
           
    
    colSelect = [2 4 5 7:9 12 14 ];
    colLabels = {'Full sample'   % 1
                 '1987-1995',     % 2
                 '1988-00',      % 3
                 '1995-2000',    % 4
                 '2000-2004',    % 5    
                 '1995-2004',    % 6
                 '2004-07',      % 7
                 '2007-14',      % 8
                 '2004-14',      % 9
                 '1988-1990'    % 10
                 'Chng after 1995 (to 2004)'             % 11  
                 'Chng after 2004 (04-07 less 00-04)'    % 12 
                 'Chg 95-00 from pre95'                  % 13
                 'Chng after 2004 (04-14 less 00-04)'} ;  % 14

    col_labs = colLabels(colSelect);

    % Define industries/subaggregates for a table of 'aggregates'
    agtabs= [71 1 2 11 22 25 29 30 31 34 43 86 87 48 88 89 55 90:92 72:79 93:97 124 125 104 105] ;  %industries & various aggregates

    tab_dTFPv_ag    = addLabels(dTFPvtab(   agtabs, colSelect),namelist(agtabs), col_labs);
    tab_contrib_ag  = addLabels(contribtab( agtabs, colSelect),namelist(agtabs), col_labs);
    tab_VVAWt_ag    = addLabels(VVAWttab(  agtabs, colSelect),namelist(agtabs), col_labs);
    tab_sskv_ag     = addLabels(sskvtab(  agtabs, colSelect),namelist(agtabs), col_labs);
    tab_finshare_ag     = addLabels(FinSharetab(  agtabs, colSelect),namelist(agtabs), col_labs);
    tab_fingrowth_ag     = addLabels(dAggRealusetab(  agtabs, colSelect),namelist(agtabs), col_labs);
    tab_skdk_ag     = addLabels(skdktab(  agtabs, colSelect),namelist(agtabs), col_labs);
    
    

    % Detailed industry tables
    dettabs= [1:size(namelist,1)] ;% [71 business] ;  %industries & aggregates
    
    tab_dTFPv_det    = addLabels(dTFPvtab(   dettabs, colSelect),namelist(dettabs), col_labs);
    tab_contrib_det  = addLabels(contribtab( dettabs, colSelect),namelist(dettabs), col_labs);
    tab_VVAWt_det    = addLabels(VVAWttab(   dettabs, colSelect),namelist(dettabs), col_labs);
    tab_sskv_det     = addLabels(sskvtab(    dettabs, colSelect),namelist(dettabs), col_labs);
    tab_finshare_det  = addLabels(FinSharetab(dettabs, colSelect),namelist(dettabs), col_labs);
    tab_fingrowth_det  = addLabels(dAggRealusetab(dettabs, colSelect),namelist(dettabs), col_labs);
    tab_skdk_det  = addLabels(skdktab(dettabs, colSelect),namelist(dettabs), col_labs);

    
    % Some alternative 'detailed' tables, e.g., finance and real estate are separate
    % These will be the main ones I focus on

    altdettabs= [71 102 100 101 30 89 88 ...   % total business and NR, construction, FIRE
                    93 72 17 44 57 103 94 95  ... % narrow bus, IT prod, non-IT prod, IT-intensive and not
                    104 2 106 98 82 29 31:33 46 34 ...                % Well
                    105  87   55    90    91    92  ...             % Poor
                    96 97                               ] ; % fin-intensive/not
    
        
    tab_dTFPv_altdet    = addLabels(dTFPvtab(   altdettabs, colSelect),namelist(altdettabs), col_labs);
    tab_contrib_altdet  = addLabels(contribtab( altdettabs, colSelect),namelist(altdettabs), col_labs);
    tab_VVAWt_altdet    = addLabels(VVAWttab(   altdettabs, colSelect),namelist(altdettabs), col_labs);
    tab_sskv_altdet     = addLabels(sskvtab(    altdettabs, colSelect),namelist(altdettabs), col_labs);
    tab_finshare_altdet  = addLabels(FinSharetab(    altdettabs, colSelect),namelist(altdettabs), col_labs);
    tab_fingrowth_altdet  = addLabels(dAggRealusetab(    altdettabs, colSelect),namelist(altdettabs), col_labs);
    tab_skdk_altdet  = addLabels(skdktab(    altdettabs, colSelect),namelist(altdettabs), col_labs);
    
    % Want some different formatting for the matrices where I write everything out
    temp_dTFPv_alldata  = addLabels([(1:size(dTFPv,2)); dTFPv],num2cell(MFPstartY:MFPendY), namelist_noindent) ;
    temp_dY_alldata     = addLabels([(1:size(dTFPv,2)); dY],   num2cell(MFPstartY:MFPendY), namelist_noindent) ;
    temp_dLabor_alldata = addLabels([(1:size(dTFPv,2)); dHrs], num2cell(MFPstartY:MFPendY), namelist_noindent) ;  % Is this hours or is it labor?
    temp_dK_alldata     = addLabels([(1:size(dTFPv,2)); dK],   num2cell(MFPstartY:MFPendY), namelist_noindent) ;
    temp_SKV_alldata    = addLabels([(1:size(dTFPv,2)); SSKV], num2cell(MFPstartY:MFPendY), namelist_noindent) ;  % average capital's share in t, t-1

    % Get rid of the phantom '1987' in the row where I have the industry number
    temp_dTFPv_alldata{2,1} = [];
    temp_dY_alldata{2,1} = [];
    temp_dLabor_alldata{2,1} = [];
    temp_dK_alldata{2,1} = [];
    temp_SKV_alldata{2,1} = [];
    
    % Now switch the first two rows
    tab_dTFPv_alldata = temp_dTFPv_alldata([2 1 3:end],:) ;
    tab_dY_alldata = temp_dY_alldata([2 1 3:end],:) ;
    tab_dLabor_alldata = temp_dLabor_alldata([2 1 3:end],:) ;
    tab_dK_alldata = temp_dK_alldata([2 1 3:end],:) ;
    tab_SKV_alldata = temp_SKV_alldata([2 1 3:end],:) ;
    
    
    
    
%    tab_dTFPv_alldata = addLabels([(1:size(dTFPv,2)); dTFPv],num2cell(MFPstartY:MFPendY), namelist) ;
%     tab_dY_alldata = addLabels(dY,num2cell(MFPstartY+1:MFPendY), namelist) ;
%     tab_dLabor_alldata = addLabels(dHrs,num2cell(MFPstartY+1:MFPendY), namelist) ;  % Is this hours or is it labor?
%     tab_dK_alldata = addLabels(dK,num2cell(MFPstartY+1:MFPendY), namelist)    ;
%     tab_SKV_alldata = addLabels(SKV(2:end,:),num2cell(MFPstartY+1:MFPendY), namelist)    ; % capital's share, start year 2...(should maybe do averages)?
%     


    % columns of dTFPv for "Table1"
    
    table_1_cols=[2 4 5 7 8 12];
    table_1_dat = [dTFPvtab( altdettabs, table_1_cols ) 100*VVAWttab(altdettabs, 1)];
    table_1_labels = [colLabels(table_1_cols); 'VA Weight (Avg, 1988-2014)'] ;
    table_1 = addLabels( table_1_dat, namelist(altdettabs), table_1_labels);
    
    
    % New table for FHSW
    table_fhsw_cols = [2 4 5 7 8 12];
    newtable_rows = [71 ... % bus
                     88  ...% finance
                     128 ...% energy
                     132 ...% transportation
                     30  ...% construction
                     72 ... % it prod              
                     129 ...% excluding finance
                     78 ... % finance-intensive
                     79 ... % non-finance intensive
                     131 ... % bus x finance and it prod
                     94 ...  % IT-intensive_drop
                     95 ... % non-IT-intensive drop
                     96 ...  % fin_intensive drop
                     97 ...  % non-finance-intensive
                     130 ... % regulatory list
                     
        ] ;
    
    table_fhsw_dat = [dTFPvtab( newtable_rows, table_fhsw_cols ) 100*VVAWttab(newtable_rows, 1)];
    table_fhsw_labels = [colLabels(table_fhsw_cols); 'VA Weight (Avg, 1988-2014)'] ;
    table_fhsw = addLabels( table_fhsw_dat, namelist_noindent(newtable_rows), table_fhsw_labels);
    
     
    
% Write out tables to Excel



regfile = [s_drive_location filesep 'out' filesep 'prod_tables.' this_date ];  % filename to use
if exist([regfile '.xlsx'], 'file'); % if it exists, then create ...v1.xlsx, ...v2.xlsx, etc.
            run_number = 1;
            while exist([regfile '.v' num2str(run_number) '.xlsx'], 'file')
                run_number = 1 + run_number;
            end % while
            regfile = [regfile '.v' num2str(run_number) ];
end

regfile = [regfile '.xlsx'];
copy_success= copyfile(['data' filesep 'Prod_table_template.xlsx'], regfile);   % will write to a template
 
% Add a date label to the page used for the bar chart
% The bar charts are in a different spreadsheet, but all the data are on this particular page from regfile
 tab_contrib_ag{1,1} = ['run ' datestr(now, 'mm.dd.yyyy HH:MM AM')];
 tab_contrib_ag{end+2,1}=['copied from file ' regfile];

count=0;
xlswrite(regfile, listmapping, 'listmapping'); count=count+1;
xlswrite(regfile, table_1, 'Table 1'); count=count+1;

% Removed a couple from earlier versions of this code
%xlswrite(regfile, tab_plotdata, 'Data for plots'); count=count+1;
%xlswrite(regfile, bin_data, 'Data for portfolio plot'); % don't add to count, it's part of init_sheets

xlswrite(regfile, tab_dTFPv_ag,     'tab_dTFPv_ag');    count=count+1;
xlswrite(regfile, tab_dTFPv_altdet, 'tab_dTFPv_altdet');count=count+1;
xlswrite(regfile, tab_dTFPv_det,    'tab_dTFPv_det');   count=count+1;

xlswrite(regfile, tab_contrib_ag,     'tab_contrib_ag');count=count+1;
xlswrite(regfile, tab_contrib_altdet, 'tab_contrib_altdet');count=count+1;
xlswrite(regfile, tab_contrib_det,    'tab_contrib_det');count=count+1;

xlswrite(regfile, tab_VVAWt_ag,     'tab_VVAWt_ag');    count=count+1;
xlswrite(regfile, tab_VVAWt_altdet, 'tab_VVAWt_altdet');count=count+1;
xlswrite(regfile, tab_VVAWt_det,    'tab_VVAWt_det');   count=count+1;

xlswrite(regfile, tab_skdk_ag,     'tab_skdk_ag');     count=count+1;
xlswrite(regfile, tab_skdk_altdet, 'tab_skdk_altdet'); count=count+1;
xlswrite(regfile, tab_skdk_det,    'tab_skdk_det');    count=count+1;

xlswrite(regfile, tab_sskv_ag,     'tab_sskv_ag');     count=count+1;
xlswrite(regfile, tab_sskv_altdet, 'tab_sskv_altdet'); count=count+1;
xlswrite(regfile, tab_sskv_det,    'tab_sskv_det');    count=count+1;
    
xlswrite(regfile, tab_fingrowth_ag,     'tab_fingrowth_ag');count=count+1;
xlswrite(regfile, tab_fingrowth_altdet, 'tab_fingrowth_altdet');count=count+1;
xlswrite(regfile, tab_fingrowth_det,    'tab_fingrowth_det');count=count+1;

xlswrite(regfile, tab_finshare_ag,     'tab_finshare_ag');    count=count+1;
xlswrite(regfile, tab_finshare_altdet, 'tab_finshare_altdet');count=count+1;
xlswrite(regfile, tab_finshare_det,    'tab_finshare_det');   count=count+1;

xlswrite(regfile, table_fhsw,    'table_fhsw');   count=count+1; % Fernald-Hall-Stock-Watson

xlswrite(regfile, tab_dTFPv_alldata,    'tab_dTFPv_alldata');   count=count+1;
xlswrite(regfile, tab_dY_alldata,    'tab_dY_alldata');   count=count+1;
xlswrite(regfile, tab_dLabor_alldata,    'tab_dLabor_alldata');   count=count+1;
xlswrite(regfile, tab_dK_alldata,    'tab_dK_alldata');   count=count+1;
xlswrite(regfile, tab_SKV_alldata,    'tab_SKV_alldata');   count=count+1;









    

% Now do a little cleaning up of the Excel files
%  There's a little documentation at 
%  http://www.mathworks.com/access/helpdesk/help/techdoc/ref/actxserver.html
%  What follows is mainly copied from code that Kyle wrote in quarterlyCapital.m
%  JGF modified the column width, basically by trying it out


% if needed: Will close Excel so make sure files are saved!!!
% system(' taskkill /F /IM excel.exe ')


[STATUS,SHEETS]=xlsfinfo(regfile) ;


numFormat = '0.00';
H = actxserver('Excel.Application');
excelWorkbook = H.workbooks.Open([ regfile ]);

worksheets = H.sheets;

if ~copy_success   % The copied template doesn't have the extra sheets in front.  If the copy failed, delete them
    worksheets.Item(1).Delete; % drop the first three sheets: 'Sheet1', 'Sheet2', 'Sheet3'
    worksheets.Item(1).Delete;
    worksheets.Item(1).Delete;
end
    
% Now we'll reformat a bit
% 11/13/09 JGF--I don't understand the coding, but it might be VBA related?
% http://msdn.microsoft.com/en-us/library/aa224873(office.11).aspx
% 
    
%excelWorkbook.ActiveSheet.PageSetup.Orientation=2; % Landscape--seems to work

init_sheets = 5 ; % initial sheets in the template that aren't written by Matlab so don't need formatting
                  

 i=init_sheets+1;                 
 worksheets.Item(i).Activate;  % Listmapping
 excelWorkbook.ActiveSheet.Cells.get('Range','B1:AA2000').numberFormat = '0';
 excelWorkbook.ActiveSheet.Cells.get('Range','B1:B2000').ColumnWidth = '70';  % This works for listmapping
 
 Range = get(worksheets.Item(i),'Range','C3');  %Freeze panes 
 Range.Activate; H.ActiveWindow.FreezePanes = 1;

 
 worksheets.Item(init_sheets+2).Activate;  % Table 1
 excelWorkbook.ActiveSheet.Cells.get('Range','B1:AA2000').numberFormat = numFormat;
 excelWorkbook.ActiveSheet.Cells.get('Range','G1:G2000').numberFormat = '0.0'; % VA Weight 
 excelWorkbook.ActiveSheet.Cells.get('Range','A1:A2000').ColumnWidth = '46';
 excelWorkbook.ActiveSheet.Cells.get('Range','G1:G2000').ColumnWidth = '14';
 excelWorkbook.ActiveSheet.Cells.get('Range','A1:Z1').WrapText = true; 
 excelWorkbook.ActiveSheet.Cells.get('Range','G1:G2000').HorizontalAlignment = -4152 ; % -4152 is Excel's xlRight (right-justify)

 Range = get(worksheets.Item(init_sheets+2),'Range','B3');  %Freeze panes 
 Range.Activate;  H.ActiveWindow.FreezePanes = 1;
  
 
 % Bing 5/20/15: this formatting bit didn't work for me...but sometimes the COM server has issues on my end
for i=(init_sheets+3):(init_sheets+count);  % start at +3 because listmapping and Table 1 were formatted above
    worksheets.Item(i).Activate;
    excelWorkbook.ActiveSheet.Cells.get('Range','B1:EZ2000').numberFormat = numFormat;
    excelWorkbook.ActiveSheet.Cells.get('Range','A1:A2000').ColumnWidth = '46';
    excelWorkbook.ActiveSheet.Cells.get('Range','A1:Z1').WrapText = true; 

    Range = get(worksheets.Item(i),'Range','B3');  %Freeze panes for each table
    Range.Activate;
    H.ActiveWindow.FreezePanes = 1;
end

worksheets.Item(1).Activate;  % Readme
excelWorkbook.Save;
excelWorkbook.Close(false);
H.Quit;

delete(H)


% End of Excel cleanup






% Now run the code to do regdata


